A Virtual Server Farm (VSF) is created out of a wide scale 
computing fabric ("Computing Grid") which is physically 
constructed once and then logically divided up into VSFs for 
various organizations on demand. Allocation and control of 
the elements in the VSF is performed by a control plane 
connected to all computing, networking, and storage ele- 
ments in the computing grid through special control ports. 
The control plane is comprised of a control mechanism 
hierarchy that includes one or more master control process 
mechanisms communicatively coupled to one or more slave 
control process mechanisms. The one or more master control 
process mechanisms instruct the slave control process 
mechanisms to establish VSFs by selecting subsets of
processing and storage resources.

METHOD AND APPARATUS FOR
CONTROLLING AN EXTENSIBLE 
COMPUTING SYSTEM

"This application is a Continuation-In-Part of application
Ser. No. 09/502,170 filed Feb. 11, 2000, and claims priority
from U.S. Provisional Patent Application No. 60/150,394,
filed on Aug. 23, 1999, entitled EXTENSIBLE COMPUTING
SYSTEM, and from U.S. Provisional Patent Application
No. 60/213,090, filed on Jun. 20, 2000, entitled CONTROL
PLANE FUNCTIONAL DESIGN, the contents of which are
hereby incorporated by reference in their entirety."

FIELD OF THE INVENTION

The present invention relates generally to data processing. 
The invention relates more specifically to a method and 
apparatus for controlling a computing grid. 

BACKGROUND OF THE INVENTION

Builders of Web sites and other computer systems today 
are faced with many challenging systems planning issues. 
These issues include capacity planning, site availability and 
site security. Accomplishing these objectives requires
finding and hiring trained personnel capable of engineering
and operating a site, which may be potentially large and
complicated. This has proven to be difficult for many
organizations because designing, constructing and operating
large sites is often outside their core business.

One approach has been to host an enterprise Web site at 
a third party site, co-located with other Web sites of other 
enterprises. Such outsourcing facilities are currently avail- 
able from companies such as Exodus, AboveNet, 
GlobalCenter, etc. These facilities provide physical space
and redundant network and power facilities shared by mul- 
tiple customers. 

Although outsourcing web site hosting greatly reduces the
task of establishing and maintaining a web site, it does not
relieve a company of all of the problems associated with
maintaining a web site. Companies must still perform many
tasks relating to their computing infrastructure in the course
of building, operating and growing their facilities. Information
technology managers of the enterprises hosted at such
facilities remain responsible for manually selecting,
installing, configuring, and maintaining their own computing
equipment at the facilities. The managers must still
confront difficult issues such as resource planning and
handling peak capacity. Specifically, managers must estimate
resource demands and request resources from the
outsourcing company to handle the demands. Many managers
ensure sufficient capacity by requesting substantially
more resources than are needed to provide a cushion against
unexpected peak demands. Unfortunately, this often results
in significant amounts of unused capacity that increases
companies' overhead for hosting their web sites.

Even when outsourcing companies also provide complete 
computing facilities including servers, software and power 
facilities, the facilities are no easier to scale and grow for the 
outsourcing company, because growth involves the same
manual and error-prone administrative steps. In addition, 
problems remain with capacity planning for unexpected 
peak demand. In this situation, the outsourcing companies 
often maintain significant amounts of unused capacity. 

Further, Web sites managed by outsourcing companies
often have different requirements. For example, some companies
may require the ability to independently administer
and control their Web sites. Other companies may require a
particular type or level of security that isolates their Web
sites from all other sites that are co-located at an outsourcing
company. As another example, some companies may require
a secure connection to an enterprise Intranet located elsewhere.

Also, various Web sites differ in internal topology. Some 
sites simply comprise a row of Web servers that are load 
balanced by a Web load balancer. Suitable load balancers are
Local Director from Cisco Systems, Inc., BigIP from
F5 Labs, Web Director from Alteon, etc. Other sites may be
constructed in a multi-tier fashion, whereby a row of Web 
servers handle Hypertext Transfer Protocol (HTTP) 
requests, but the bulk of the application logic is implemented 
in separate application servers. These application servers in 
turn may need to be connected back to a tier of database 
servers. 

Some of these different configuration scenarios are shown 
in FIG. 1A, FIG. 1B, and FIG. 1C. FIG. 1A is a block
diagram of a simple Web site, comprising a single comput- 
ing element or machine 100 that includes a CPU 102 and 
disk 104. Machine 100 is coupled to the global, packet- 
switched data network known as the Internet 106, or to 
another network. Machine 100 may be housed in a 
co-location service of the type described above. 

FIG. 1B is a block diagram of a 1-tier Web server farm
110 comprising a plurality of Web servers WSA, WSB, 
WSC. Each of the Web servers is coupled to a load-balancer 
112 that is coupled to Internet 106. The load balancer divides 
the traffic between the servers to maintain a balanced pro- 
cessing load on each server. Load balancer 112 may also 
include or may be coupled to a firewall for protecting the 
Web servers from unauthorized traffic. 

FIG. 1C shows a 3-tier server farm 120 comprising a tier
of Web servers W1, W2, etc., a tier of application servers A1,
A2, etc., and a tier of database servers D1, D2, etc. The Web
servers are provided for handling HTTP requests. The appli- 
cation servers execute the bulk of the application logic. The 
database servers execute database management system 
(DBMS) software. 

Given the diversity in topology of the kinds of Web sites 
that need to be constructed and the varying requirements of 
the corresponding companies, it may appear that the only 
way to construct large-scale Web sites is to physically 
custom build each site. Indeed, this is the conventional 
approach. Many organizations are separately struggling with 
the same issues, and custom building each Web site from 
scratch. This is inefficient and involves a significant amount 
of duplicate work at different enterprises. 

Still another problem with the conventional approach is 
resource and capacity planning. A Web site may receive 
vastly different levels of traffic on different days or at 
different hours within each day. At peak traffic times, the 
Web site hardware or software may be unable to respond to 
requests in a reasonable time because it is overloaded. At 
other times, the Web site hardware or software may have 
excess capacity and be underutilized. In the conventional 
approach, finding a balance between having sufficient hard- 
ware and software to handle peak traffic, without incurring 
excessive costs or having over-capacity, is a difficult prob- 
lem. Many Web sites never find the right balance and 
chronically suffer from under-capacity or excess capacity. 

Yet another problem is failure induced by human error. A 
great potential hazard present in the current approach of 
using manually constructed server farms is that human error 
in configuring a new server into a live server farm can cause
the server farm to malfunction, possibly resulting in loss of
service to users of that Web site.

Based on the foregoing, there is a clear need in this field 
for improved methods and apparatuses for providing a 
computing system that is instantly and easily extensible on 
demand without requiring custom construction. 

There is also a need for a computing system that supports 
creation of multiple segregated processing nodes, each of 
which can be expanded or collapsed as needed to account for 
changes in traffic throughput. 

There is a further need for a method and apparatus for 
controlling such an extensible computing system and its 
constituent segregated processing nodes. Other needs will 
become apparent from the disclosure provided herein. 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, the foregoing
needs, and other needs that will become apparent from
the following description, are achieved by a method and
apparatus for controlling and managing highly scalable,
highly available and secure data processing sites, based on
a wide scale computing fabric ("computing grid"). The
computing grid is physically constructed once, and then 
logically divided up for various organizations on demand. 
The computing grid comprises a large plurality of comput- 
ing elements that are coupled to one or more VLAN 
switches and to one or more storage area network (SAN) 
switches. A plurality of storage devices are coupled to the 
SAN switches and may be selectively coupled to one or 
more of the computing elements through appropriate switch- 
ing logic and commands. One port of the VLAN switch is 
coupled to an external network, such as the Internet. A
supervisory mechanism, layer, machine or process is 
coupled to the VLAN switches and SAN switches. 

Initially, all storage devices and computing elements are 
assigned to Idle Pools. Under program control, the supervi- 
sory mechanism dynamically configures the VLAN 
switches and SAN switches to couple their ports to one or 
more computing elements and storage devices. As a result, 
such elements and devices are logically removed from the 
Idle Pools and become part of one or more virtual server 
farms (VSFs) or instant data centers (IDCs). Each VSF 
computing element is pointed to or otherwise associated 
with a storage device that contains a boot image usable by 
the computing element for bootstrap operation and produc- 
tion execution. 
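
As a rough illustration of the paragraph above, the following Python sketch models the supervisory mechanism's bookkeeping: every element and storage device starts in an Idle Pool, is assigned to a VSF under program control, and can then reach only storage assigned to the same VSF. The class name `Supervisor`, the `IDLE` label, and the method names are hypothetical and are not taken from this disclosure. Real isolation would be enforced by the VLAN and SAN switches, not by software tables like these.

```python
IDLE = "IDLE"  # hypothetical label standing in for the Idle Pools


class Supervisor:
    """Toy model of which VSF each element and storage device belongs to."""

    def __init__(self, elements, disks):
        # Initially, everything is assigned to the Idle Pools.
        self.element_vsf = {e: IDLE for e in elements}
        self.disk_vsf = {d: IDLE for d in disks}

    def assign(self, vsf, elements, disks):
        """Logically remove idle elements and disks from the pools into a VSF."""
        for table, names in ((self.element_vsf, elements), (self.disk_vsf, disks)):
            for name in names:
                if table[name] != IDLE:
                    raise ValueError(f"{name} is not in an Idle Pool")
        for e in elements:
            self.element_vsf[e] = vsf
        for d in disks:
            self.disk_vsf[d] = vsf

    def can_access(self, element, disk):
        """An element may reach a disk only when both belong to the same VSF."""
        owner = self.element_vsf[element]
        return owner != IDLE and owner == self.disk_vsf[disk]


grid = Supervisor(["CPU1", "CPU2"], ["DISK1", "DISK2"])
grid.assign("VSF1", ["CPU1"], ["DISK1"])
grid.assign("VSF2", ["CPU2"], ["DISK2"])
```

In this sketch, `grid.can_access("CPU1", "DISK1")` holds while `grid.can_access("CPU1", "DISK2")` does not, which mirrors the logical partitioning described in the text.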

According to one aspect of the invention, the supervisory 
layer is a control plane comprised of a control mechanism 
hierarchy that includes one or more master control process 
mechanisms communicatively coupled to one or more slave 
control process mechanisms. The one or more master control 
process mechanisms allocate and de-allocate slave control
process mechanisms based upon slave control process 
mechanism loading. The one or more master control process 
mechanisms instruct the slave control process mechanisms 
to establish IDCs by selecting subsets of processing and 
storage resources. The one or more master control process 
mechanisms perform periodic health checks on the slave 
control process mechanisms. Non-responsive or failed slave 
control mechanisms are restarted. Additional slave control 
mechanisms are initiated to replace slave control mecha- 
nisms that cannot be restarted. The slave control mecha- 
nisms perform periodic health checks on the master control 
mechanisms. When a master control process mechanism
has failed, a slave control process mechanism is elected
to be a new master control process mechanism to replace the 
failed master control process mechanism. 
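
The health-check and election behavior described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the mechanism of the disclosure: the `Process` record, its `healthy` and `restartable` flags, and the rule that the healthy slave with the lowest identifier wins the election are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """Hypothetical stand-in for a master or slave control process mechanism."""
    pid: int
    healthy: bool = True
    restartable: bool = True


def check_slaves(slaves, next_pid):
    """One health-check pass by a master.

    Failed slaves are restarted when possible; slaves that cannot be
    restarted are replaced by newly initiated ones. Returns the new slave
    list and the next unused identifier.
    """
    result = []
    for slave in slaves:
        if slave.healthy:
            result.append(slave)
        elif slave.restartable:
            result.append(Process(slave.pid))   # restarted, healthy again
        else:
            result.append(Process(next_pid))    # replacement slave
            next_pid += 1
    return result, next_pid


def elect_master(master, slaves):
    """Slave-side check: if the master has failed, promote one slave."""
    if master.healthy:
        return master, slaves
    candidates = [s for s in slaves if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy slave available for election")
    # Assumption: the lowest identifier wins; the text does not fix a rule here.
    new_master = min(candidates, key=lambda s: s.pid)
    return new_master, [s for s in slaves if s is not new_master]
```

The election rule is deliberately trivial. Any deterministic rule that all slaves can evaluate consistently would serve the same illustrative purpose.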

Physically constructing the computing grid once, and
securely and dynamically allocating portions of the computing
grid to various organizations on demand achieve
economies of scale that are difficult to achieve when creating
a custom build of each site.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention is illustrated by way of example, 
and not by way of limitation, in the figures of the accom- 
panying drawings and in which like reference numerals refer 
to similar elements and in which: 

FIG. 1A is a block diagram of a simple Web site having
a single computing element topology.

FIG. 1B is a block diagram of a one-tier Web server farm.

FIG. 1C is a block diagram of a three-tier Web server
farm.

FIG. 2 is a block diagram of one configuration of an
extensible computing system 200 that includes a local
computing grid.

FIG. 3 is a block diagram of an exemplary virtual server
farm featuring a SAN Zone.

FIG. 4A, FIG. 4B, FIG. 4C, and FIG. 4D are block
diagrams showing successive steps involved in adding a
computing element to, and removing a computing element
from, a virtual server farm.

FIG. 5 is a block diagram of an embodiment of a virtual
server farm system, computing grid, and supervisory
mechanism.

FIG. 6 is a block diagram of logical connections of a
virtual server farm.

FIG. 7 is a block diagram of logical connections of a
virtual server farm.

FIG. 8 is a block diagram of logical connections of a
virtual server farm.

FIG. 9 is a block diagram of a logical relationship
between a control plane and a data plane.

FIG. 10 is a state diagram of a master control election
process.

FIG. 11 is a state diagram for a slave control process.

FIG. 12 is a state diagram for a master control process.

FIG. 13 is a block diagram of a central control processor
and multiple control planes and computing grids.

FIG. 14 is a block diagram of an architecture for
implementing portions of a control plane and a computing grid.

FIG. 15 is a block diagram of a system with a computing
grid that is protected by a firewall.

FIG. 16 is a block diagram of an architecture for
connecting a control plane to a computing grid.

FIG. 17 is a block diagram of an arrangement for
enforcing tight binding between VLAN tags and IP addresses.

FIG. 18 is a block diagram of a plurality of VSFs extended
over WAN connections.

FIG. 19 is a block diagram of a computer system with
which an embodiment may be implemented.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, for the purposes of
explanation, numerous specific details are set forth in order
to provide a thorough understanding of the present invention.
It will be apparent, however, to one skilled in the art
that the present invention may be practiced without these
specific details. In other instances, well-known structures
and devices are shown in block diagram form in order to
avoid unnecessarily obscuring the present invention.

VIRTUAL SERVER FARM (VSF)

According to one embodiment, a wide scale computing
fabric ("computing grid") is provided. The computing grid
may be physically constructed once, and then logically
partitioned on demand. A part of the computing grid is
allocated to each of a plurality of enterprises or organizations.
Each organization's logical portion of the computing
grid is referred to as a Virtual Server Farm (VSF). Each
organization retains independent administrative control of
its VSF. Each VSF can change dynamically in terms of
number of CPUs, storage capacity and disk and network
bandwidth based on real-time demands placed on the server
farm or other factors. Each VSF is secure from every other
organization's VSF, even though they are all logically created
out of the same physical computing grid. A VSF can be
connected back to an Intranet using either a private leased
line or a Virtual Private Network (VPN), without exposing
the Intranet to other organizations' VSFs.

An organization can access only the data and computing
elements in the portion of the computing grid allocated to it,
that is, in its VSF, even though it may exercise full (e.g.
super-user or root) administrative access to these computers
and can observe all traffic on Local Area Networks (LANs)
to which these computers are connected. According to one
embodiment, this is accomplished using a dynamic firewalling
scheme, where the security perimeter of the VSF
expands and shrinks dynamically. Each VSF can be used to
host the content and applications of an organization that may
be accessed via the Internet, Intranet or Extranet.

Configuration and control of the computing elements and
their associated networking and storage elements is performed
by a supervisory mechanism that is not directly
accessible through any of the computing elements in the
computing grid. For convenience, in this document the
supervisory mechanism is referred to generally as a control
plane and may comprise one or more processors or a
network of processors. The supervisory mechanism may
comprise a Supervisor, Controller, etc. Other approaches
may be used, as described herein.

The control plane is implemented on a completely independent
set of computing elements assigned for supervisory
purposes, such as one or more servers that may be interconnected
in a network or by other means. The control plane
performs control actions on the computing, networking and
storage elements of the computing grid through special
control ports or interfaces of the networking and storage
elements in the grid. The control plane provides a physical
interface to switching elements of the system, monitors
loads of computing elements in the system, and provides
administrative and management functions using a graphical
user interface or other suitable user interface.

Computers used to implement the control plane are logically
invisible to computers in the computing grid (and
therefore in any specific VSF) and cannot be attacked or
subverted in any way via elements in the computing grid or
from external computers. Only the control plane has physical
connections to the control ports on devices in the
computing grid, which controls membership in a particular
VSF. The devices in the computing grid can be configured only
through these special control ports, and therefore computing
elements in the computing grid are unable to change their
security perimeter or access storage or computing devices
which they are not authorized to access.

Thus, a VSF allows organizations to work with computing
facilities that appear to comprise a private server farm,
dynamically created out of a large-scale shared computing
infrastructure, namely the computing grid. A control plane
coupled with the computing architecture described herein
provides a private server farm whose privacy and integrity
is protected through access control mechanisms implemented
in the hardware of the devices of the computing grid.

The control plane controls the internal topology of each
VSF. The control plane can take the basic interconnection of
computers, network switches and storage network switches
described herein and use them to create a variety of server
farm configurations. These include, but are not limited to,
single-tier Web server farms front-ended by a load balancer,
as well as multi-tier configurations, where a Web server talks
to an application server, which in turn talks to a database
server. A variety of load balancing, multi-tiering and
firewalling configurations are possible.

THE COMPUTING GRID

The computing grid may exist in a single location or may
be distributed over a wide area. First this document
describes the computing grid in the context of a single
building-sized network, composed purely of local area
technologies. Then the document describes the case where the
computing grid is distributed over a wide area network
(WAN).

FIG. 2 is a block diagram of one configuration of an
extensible computing system 200 that includes a local
computing grid 208. In this document "extensible" generally
means that the system is flexible and scalable, having the
capability to provide increased or decreased computing
power to a particular enterprise or user upon demand. The
local computing grid 208 is composed of a large number of
computing elements CPU1, CPU2, . . . CPUn. In an exemplary
embodiment, there may be 10,000 computing
elements, or more. These computing elements do not contain
or store any long-lived per-element state information, and
therefore may be configured without persistent or non-volatile
storage such as a local disk. Instead, all long-lived
state information is stored separate from the computing
elements, on disks DISK1, DISK2, . . . DISKn that are
coupled to the computing elements via a Storage Area
Network (SAN) comprising one or more SAN Switches 202.
Examples of suitable SAN switches are commercially available
from Brocade and Excel.

All of the computing elements are interconnected to each
other through one or more VLAN switches 204 which can
be divided up into Virtual LANs (VLANs). The VLAN
switches 204 are coupled to the Internet 106. In general a
computing element contains one or two network interfaces
connected to the VLAN switch. For the sake of simplicity,
in FIG. 2 all nodes are shown with two network interfaces,
although some may have fewer or more network interfaces.
Many commercial vendors now provide switches supporting
VLAN functionality. For example, suitable VLAN switches
are commercially available from Cisco Systems, Inc. and
Xtreme Networks. Similarly there are a large number of
commercially available products to construct SANs, including
Fibre Channel switches, SCSI-to-Fibre-Channel bridging
devices, and Network Attached Storage (NAS) devices.

Control plane 206 is coupled by a SAN Control path, CPU
Control path, and VLAN Control path to SAN switches 202,
CPUs CPU1, CPU2, . . . CPUn, and VLAN Switches 204,
respectively.

Each VSF is composed of a set of VLANs, a set of
computing elements that are attached to the VLANs, and a
subset of the storage available on the SAN that is coupled to
the set of computing elements. The subset of the storage
available on the SAN is referred to as a SAN Zone and is
protected by the SAN hardware from access from computing
elements that are part of other SAN zones. Preferably,
VLANs that provide non-forgeable port identifiers are used
to prevent one customer or end user from obtaining access
to VSF resources of another customer or end user.

FIG. 3 is a block diagram of an exemplary virtual server
farm featuring a SAN Zone. A plurality of Web servers WS1,
WS2, etc., are coupled by a first VLAN (VLAN1) to a load
balancer (LB)/firewall 302. A second VLAN (VLAN2)
couples the Internet 106 to the load balancer (LB)/firewall
302. Each of the Web servers may be selected from among
CPU1, CPU2, etc., using mechanisms described further
herein. The Web servers are coupled to a SAN Zone 304,
which is coupled to one or more storage devices 306a, 306b.

At any given point in time, a computing element in the
computing grid, such as CPU1 of FIG. 2, is only connected
to the set of VLANs and the SAN zone(s) associated with a
single VSF. A VSF typically is not shared among different
organizations. The subset of storage on the SAN that
belongs to a single SAN zone, and the set of VLANs
associated with it and the computing elements on these
VLANs define a VSF.

By controlling the membership of a VLAN and the
membership of a SAN zone, the control plane enforces a logical
partitioning of the computing grid into multiple VSFs.
Members of one VSF cannot access the computing or
storage resources of another VSF. Such access restrictions
are enforced at the hardware level by the VLAN switches,
and by port-level access control mechanisms (e.g., zoning)
of SAN hardware such as Fibre Channel switches and edge
devices such as SCSI to Fibre Channel bridging hardware.
Computing elements that form part of the computing grid
are not physically connected to the control ports or interfaces
of the VLAN switches and the SAN switches, and
therefore cannot control the membership of the VLANs or
SAN zones. Accordingly, the computing elements of the
computing grid cannot access computing elements not
located in the VSF in which they are contained.

Only the computing elements that run the control plane
are physically connected to the control ports or interfaces of
the devices in the grid. Devices in the computing grid
(computers, SAN switches and VLAN switches) can only be
configured through such control ports or interfaces. This
provides a simple yet highly secure means of enforcing the
dynamic partitioning of the computing grid into multiple
VSFs.

Each computing element in a VSF is replaceable by any
other computing element. The number of computing
elements, VLANs and SAN zones associated with a given
VSF may change over time under control of the control plane.

In one embodiment, the computing grid includes an Idle
Pool that comprises a large number of computing elements
that are kept in reserve. Computing elements from the Idle
Pool may be assigned to a particular VSF for reasons such
as increasing the CPU or memory capacity available to that
VSF, or to deal with failures of a particular computing
element in a VSF. When the computing elements are configured
as Web servers, the Idle Pool serves as a large "shock
absorber" for varying or "bursty" Web traffic loads and
related peak processing loads.

The Idle Pool is shared between many different
organizations, and therefore it provides economies of scale,
since no single organization has to pay for the entire cost of
the Idle Pool. Different organizations can obtain computing
elements from the Idle Pool at different times in the day, as
needed, thereby enabling each VSF to grow when required
and shrink when traffic falls down to normal. If many
different organizations continue to peak at the same time and
thereby potentially exhaust the capacity of the Idle Pool, the
Idle Pool can be increased by adding more CPUs and storage
elements to it (scalability). The capacity of the Idle Pool is
engineered so as to greatly reduce the probability that, in
steady state, a particular VSF may not be able to obtain an
additional computing element from the Idle Pool when it
needs to.

FIG. 4A, FIG. 4B, FIG. 4C, and FIG. 4D are block
diagrams showing successive steps involved in moving a
computing element in and out of the Idle Pool. Referring
first to FIG. 4A, assume that the control plane has logically
connected elements of the computing grid into first and
second VSFs labeled VSF1, VSF2. Idle Pool 400 comprises
a plurality of CPUs 402, one of which is labeled CPUX. In
FIG. 4B, VSF1 has developed a need for an additional
computing element. Accordingly, the control plane moves
CPUX from Idle Pool 400 to VSF1, as indicated by path
404.

In FIG. 4C, VSF1 no longer needs CPUX, and therefore
the control plane moves CPUX out of VSF1 and back into
the Idle Pool 400. In FIG. 4D, VSF2 has developed a need
for an additional computing element. Accordingly, the control
plane moves CPUX from the Idle Pool 400 to VSF2.
Thus, over the course of time, as traffic conditions change,
a single computing element may belong to the Idle Pool
(FIG. 4A), then be assigned to a particular VSF (FIG. 4B),
then be placed back in the Idle Pool (FIG. 4C), and then
belong to another VSF (FIG. 4D).

At each one of these stages, the control plane configures
the LAN switches and SAN switches associated with that
computing element to be part of the VLANs and SAN zones
associated with a particular VSF (or the Idle Pool). According
to one embodiment, in between each transition, the
computing element is powered down or rebooted. When the
computing element is powered back up, the computing
element views a different portion of storage zone on the
SAN. In particular, the computing element views a portion
of storage zone on the SAN that includes a bootable image
of an operating system (e.g., Linux, NT, Solaris, etc.). The
storage zone also includes a data portion that is specific to
each organization (e.g., files associated with a Web server,
database partitions, etc.). The computing element is also part
of another VLAN which is part of the VLAN set of another
VSF, so it can access CPUs, SAN storage devices and NAS
devices associated with the VLANs of the VSF into which
it has been transitioned.

In a preferred embodiment, the storage zones include a
plurality of pre-defined logical blueprints that are associated
with roles that may be assumed by the computing elements.
Initially, no computing element is dedicated to any particular
role or task such as Web server, application server, database
server, etc. The role of the computing element is acquired
from one of a plurality of pre-defined, stored blueprints,
each of which defines a boot image for the computing
elements that are associated with that role. The blueprints
may be stored in the form of a file, a database table, or any
other storage format that can associate a boot image location
with a role.

Thus, the movements of CPUX in FIG. 4A, FIG. 4B, FIG.
4C, FIG. 4D are logical, not physical, and are accomplished
by re-configuring VLAN switches and SAN Zones under
control of the control plane. Further, each computing element
in the computing grid initially is essentially fungible,
and assumes a specific processing role only after it is 
connected in a virtual server farm and loads software from 
a boot image. No computing element is dedicated to any
particular role or task such as Web server, application server, 
database server, etc. The role of the computing element is 
acquired from one of a plurality of predefined, stored 
blueprints, each of which is associated with a role, each of 
which defines a boot image for the computing elements that
are associated with that role. 
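
To make the blueprint idea concrete, the following Python sketch stores blueprints as a simple table that associates each role with a boot image location, which is one of the storage formats the text permits. The role names, the image paths, and the `ComputingElement` class are hypothetical placeholders chosen for illustration only.

```python
# Hypothetical blueprint table: role -> boot image location.
# The paths below are placeholders, not locations defined by this document.
BLUEPRINTS = {
    "web server": "san-zone/images/web-server.img",
    "application server": "san-zone/images/app-server.img",
    "database server": "san-zone/images/db-server.img",
}


class ComputingElement:
    """A fungible element that has no role until a blueprint is applied."""

    def __init__(self, name):
        self.name = name
        self.role = None        # initially not dedicated to any role
        self.boot_image = None

    def assume_role(self, role, blueprints=BLUEPRINTS):
        """Acquire a role by looking up its boot image in the blueprints."""
        self.boot_image = blueprints[role]  # raises KeyError for unknown roles
        self.role = role


element = ComputingElement("CPUX")
element.assume_role("web server")
```

Because the element itself holds no long-lived state, calling `assume_role` again with a different role is all that is needed in this model to repurpose it.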

Since there is no long-lived state information stored in 
any given computing element (such as a local disk), nodes 
are easily moved between different VSFs, and can run 
completely different OS and application software. This also
makes each computing element highly replaceable, in case 
of planned or unplanned downtime. 

A particular computing element may perform different 
roles as it is brought into and out of various VSFs. For 
example, a computing element may act as a Web server in
one VSF, and when it is brought into a different VSF, it may 
be a database server, a Web load balancer, a Firewall, etc. It 
may also successively boot and run different operating 
systems such as Linux, NT or Solaris in different VSFs. 
Thus, each computing element in the computing grid is
fungible, and has no static role assigned to it. Accordingly, 
the entire reserve capacity of the computing grid can be used 
to provide any of the services required by any VSF. This 
provides a high degree of availability and reliability to the 
services provided by a single VSF, because each server
performing a particular service has potentially thousands of 
back-up servers able to provide the same service. 

Further, the large reserve capacity of the computing grid 
can provide both dynamic load balancing properties, as well
as high processor availability. This capability is enabled by 
the unique combination of diskless computing elements 
interconnected via VLANs, and connected to a configurable 
zone of storage devices via a SAN, all controlled in real-time 
by the control plane. Every computing element can act in the
role of any required server in any VSF, and can connect to
any logical partition of any disk in the SAN. When the grid 
requires more computing power or disk capacity, computing 
elements or disk storage is manually added to the idle pool, 
which may decrease over time as more organizations are
provided VSF services. No manual intervention is required 
in order to increase the number of CPUs, network and disk 
bandwidth and storage available to a VSF. All such resources 
are allocated on demand from CPU, network and disk 
resources available in the Idle Pool by the control plane.
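
The division of labor described above, manual additions to the idle pool and automatic allocation from it, can be sketched as follows. This is a minimal sketch under assumed names (`IdlePool`, `allocate`, `release`); the specification does not define such an interface.

```python
class IdlePool:
    """Tracks computing elements that are not assigned to any VSF."""

    def __init__(self, elements):
        self._idle = list(elements)

    def __len__(self):
        return len(self._idle)

    def add(self, element):
        """Manually add newly installed capacity to the idle pool."""
        self._idle.append(element)

    def allocate(self, count):
        """Hand out `count` idle elements, or raise if the pool is too small."""
        if count > len(self._idle):
            raise RuntimeError("idle pool exhausted; add capacity to the grid")
        allocated, self._idle = self._idle[:count], self._idle[count:]
        return allocated

    def release(self, elements):
        """Return elements to the idle pool when a VSF no longer needs them."""
        self._idle.extend(elements)
```

In this sketch only `add` corresponds to a manual step; `allocate` and `release` stand in for actions the control plane takes on demand.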

A particular VSF is not subjected to manual reconfigura- 
tion. Only the computing elements in the idle pool are 
manually configured into the computing grid. As a result, a 
great potential hazard present in current manually constructed
server farms is removed. The possibility that human
error in configuring a new server into a live server farm can
cause the server farm to malfunction, possibly resulting in 
loss of service to users of that Web site, is virtually elimi- 
nated. 

The control plane also replicates data stored in SAN
attached storage devices, so that failure of any particular 
storage element does not cause a loss of service to any part 
of the system. By decoupling long-lived storage from com- 
puting devices using SANs, and by providing redundant 
storage and computing elements, where any computing
element can be attached to any storage partition, a high 
degree of availability is achieved. 

A DETAILED EXAMPLE OF ESTABLISHING A
VIRTUAL SERVER FARM, ADDING A 
PROCESSOR TO IT, AND REMOVING A 
PROCESSOR FROM IT 

FIG. 5 is a block diagram of a computing grid and control 
plane mechanism according to an embodiment. With refer- 
ence to FIG. 5, the following describes the detailed steps that 
may be used to create a VSF, add nodes to it and delete nodes
from it.

FIG. 5 depicts computing elements 502, comprising com- 
puters A through G coupled to VLAN capable switch 504. 
VLAN switch 504 is coupled to Internet 106, and the VLAN 
switch has ports V1, V2, etc. Computers A through G are
further coupled to SAN switch 506, which is coupled to a 
plurality of storage devices or disks D1-D5. The SAN 
switch 506 has ports S1, S2, etc. A control plane mechanism
508 is communicatively coupled by control paths and data 
paths to SAN switch 506 and to VLAN switch 504. The 
control plane is able to send control commands to these 
devices through the control ports. 

For the sake of simplicity and exposition, the number of 
computing elements in FIG. 5 is a small number. In practice, 
a large number of computers, e.g., thousands or more, and an 
equally large number of storage devices form the computing 
grid. In such larger structures, multiple SAN switches are 
interconnected to form a mesh, and multiple VLAN switches 
are interconnected to form a VLAN mesh. For clarity and 
simplicity, however, FIG. 5 shows a single SAN switch and 
a single VLAN switch. 

Initially, all computers A-G are assigned to the idle pool 
until the control plane receives a request to create a VSF. All 
ports of the VLAN switch are assigned to a specific VLAN 
which we shall label as VLAN I (for the idle zone). Assume 
that the control plane is asked to construct a VSF, containing 
one load balancer/firewall and two Web servers connected to 
a storage device on the SAN. Requests to the control plane may
arrive through a management interface or other computing 
element. 

In response, the control plane assigns or allocates CPU A 
as the load balancer/firewall, and allocates CPUs B and C as 
the Web servers. CPU A is logically placed in SAN Zone 1, 
and pointed to a bootable partition on a disk that contains 
dedicated load balancing/firewalling software. The term
"pointed to" is used for convenience and is intended to 
indicate that CPU A is given, by any means, information 
sufficient to enable CPU A to obtain or locate appropriate 
software that it needs to operate. Placement of CPU A in 
SAN Zone 1 enables CPU A to obtain resources from disks 
that are controlled by the SAN of that SAN Zone. 

The load balancer is configured by the control plane to 
know about CPUs B and C as the two Web servers it is 
supposed to load balance. The firewall configuration pro- 
tects CPUs B and C against unauthorized access from the 
Internet 106. CPUs B and C are pointed to a disk partition 
on the SAN that contains a bootable OS image for a 
particular operating system (e.g., Solaris, Linux, NT, etc.) and
Web server application software (e.g., Apache). The VLAN
switch is configured to place ports v1 and v2 on VLAN 1,
and ports v3, v4, v5, v6 and v7 on VLAN 2. The control
plane configures the SAN switch 506 to place Fibre-Channel
switch ports s1, s2, s3 and s8 into SAN zone 1.
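
The port assignments for VSF 1 can be illustrated with a simple model in which each switch port belongs to exactly one VLAN or SAN zone. The `Switch` class, the port counts (v1 to v15, s1 to s9) and the idle group names are assumptions made for this sketch; a real control plane would issue vendor-specific commands through the switch control ports.

```python
class Switch:
    """Maps each port to exactly one group (a VLAN or a SAN zone)."""

    def __init__(self, ports, idle_group):
        self.idle_group = idle_group
        # Every port starts in the idle group, as in the initial state.
        self.group_of = {port: idle_group for port in ports}

    def assign(self, ports, group):
        """Move the given ports into `group`."""
        for port in ports:
            if port not in self.group_of:
                raise KeyError(f"unknown port {port}")
            self.group_of[port] = group

    def members(self, group):
        """List the ports currently in `group`, in port-creation order."""
        return [p for p, g in self.group_of.items() if g == group]


# Assumed port counts, chosen to cover the ports named in the example.
vlan_switch = Switch([f"v{i}" for i in range(1, 16)], idle_group="VLAN I")
san_switch = Switch([f"s{i}" for i in range(1, 10)], idle_group="SAN Zone I")

# Configuration of VSF 1 as described above.
vlan_switch.assign(["v1", "v2"], "VLAN 1")
vlan_switch.assign(["v3", "v4", "v5", "v6", "v7"], "VLAN 2")
san_switch.assign(["s1", "s2", "s3", "s8"], "SAN zone 1")
```

Ports that are not named in the configuration, such as v8 or s4, remain in the idle group until the control plane assigns them.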

A description of how a CPU is pointed to a particular disk 
drive, and what this means for booting up and shared access 
to disk data, is provided further herein. 

FIG. 6 is a block diagram of the resulting logical
connectivity of computing elements, which are collectively
called VSF 1. Disk drive DD1 is selected from among
storage devices D1, D2, etc. Once the logical structure as
shown in FIG. 6 is achieved, CPUs A, B, C are given a
power-up command. In response, CPU A becomes a dedicated
load balancer/firewall computing element, and CPUs
B, C become Web servers.

Now, assume that because of a policy-based rule, the 
control plane determines that another Web server is required 
in VSF 1. This may be caused, for example, by an increased
number of requests to the Web site when the customer's plan
permits at least three Web servers to be added to VSF 1. Or
it may be because the organization that owns or operates the 
VSF wants another server, and has added it through an 
administrative mechanism, such as a privileged Web page 
that allows it to add more servers to its VSF. 

In response, the control plane decides to add CPU D to 
VSF 1. In order to do this, the control plane will add CPU 
D to VLAN 2 by adding ports v8 and v9 to VLAN 2. Also, 
CPU D's SAN port s4 is added to SAN zone 1. CPU D is 
pointed to a bootable portion of the SAN storage that boots 
up and runs as a Web server. CPU D also gets read-only 
access to the shared data on the SAN, which may consist of 
Web page contents, executable server scripts, etc. This way 
it is able to serve Web requests intended for the server farm 
much as CPUs B and C serve requests. The control plane 
will also configure the load balancer (CPU A) to include 
CPU D as part of the server set which is being load balanced. 

CPU D is now booted up, and the size of the VSF has now 
increased to three Web servers and 1 load balancer. FIG. 7 
is a block diagram of the resulting logical connectivity. 
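
The steps for adding CPU D can be summarized in a short sketch. The dictionary layout, the `max_web_servers` plan limit of three and the function name are hypothetical; they only mirror the sequence described above (extend VLAN 2, extend SAN zone 1, register the node with the load balancer).

```python
def add_web_server(vsf, node, vlan_ports, san_port):
    """Return a copy of `vsf` with one more load-balanced Web server."""
    if len(vsf["web_servers"]) >= vsf["max_web_servers"]:
        raise ValueError("customer plan does not permit another Web server")
    grown = dict(vsf)
    grown["vlan_ports"] = vsf["vlan_ports"] | set(vlan_ports)  # join the VLAN
    grown["san_ports"] = vsf["san_ports"] | {san_port}         # join the SAN zone
    grown["web_servers"] = vsf["web_servers"] + [node]
    grown["balanced_set"] = vsf["balanced_set"] + [node]       # load balancer set
    return grown


# VSF 1 before the change; the plan limit of three is an assumed value.
vsf1 = {
    "vlan_ports": {"v3", "v4", "v5", "v6", "v7"},
    "san_ports": {"s1", "s2", "s3", "s8"},
    "web_servers": ["B", "C"],
    "balanced_set": ["B", "C"],
    "max_web_servers": 3,
}

vsf1_grown = add_web_server(vsf1, "D", ["v8", "v9"], "s4")
```

Returning a new dictionary rather than mutating the old one is a design choice of the sketch; it makes the before and after states easy to compare.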

Assume that the control plane now receives a request to 
create another VSF, which it will name VSF 2, and which 
needs two Web servers and one load balancer/firewall. The 
control plane allocates CPU E to be the load balancer/ 
firewall and CPUs F, G to be the Web servers. It configures 
CPU E to know about CPUs F, G as the two computing 
elements to load balance against. 

To implement this configuration, the control plane will 
configure VLAN switch 504 to include ports v10, v11 in
VLAN 1 (that is, connected to the Internet 106) and ports
v12, v13 and v14, v15 to be in VLAN 3. Similarly, it
configures SAN switch 506 to include SAN ports s6 and s7 
and s9 in SAN zone 2. This SAN zone includes the storage 
containing the software necessary to run CPU E as a 
load-balancer and CPUs F and G as Web servers that use a 
shared read-only disk partition contained in Disk D2 in SAN 
zone 2. 

FIG. 8 is a block diagram of the resulting logical con- 
nectivity. Although two VSFs (VSF 1, VSF 2) share the 
same physical VLAN switch and SAN switch, the two VSFs 
are logically partitioned. Users who access CPUs B, C, D, or 
the enterprise that owns or operates VSF 1 can only access 
the CPUs and storage of VSF 1. Such users cannot access the 
CPUs or storage of VSF 2. This occurs because of the 
combination of the separate VLANs and the 2 firewalls on 
the only shared segment (VLAN 1), and the different SAN 
zones in which the two VSFs are configured. 
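
The logical partitioning of FIG. 8 can be expressed as a membership test: two ports can exchange traffic only if they are in the same VLAN, and a CPU can reach storage only through ports in the same SAN zone. The tables below restate the port assignments given in the example; the function name `same_group` is hypothetical.

```python
# Port-to-VLAN assignments while both VSFs exist and CPU D is still in VSF 1.
VLAN_OF = {}
VLAN_OF.update({p: "VLAN 1" for p in ("v1", "v2", "v10", "v11")})  # Internet side
VLAN_OF.update({f"v{i}": "VLAN 2" for i in range(3, 10)})    # v3..v9, VSF 1
VLAN_OF.update({f"v{i}": "VLAN 3" for i in range(12, 16)})   # v12..v15, VSF 2

# Port-to-zone assignments on the SAN switch.
SAN_ZONE_OF = {p: "SAN zone 1" for p in ("s1", "s2", "s3", "s4", "s8")}
SAN_ZONE_OF.update({p: "SAN zone 2" for p in ("s6", "s7", "s9")})


def same_group(port_a, port_b, group_of):
    """True when both ports belong to the same VLAN or the same SAN zone."""
    return group_of[port_a] == group_of[port_b]
```

In this model the only segment common to both VSFs is VLAN 1, which is why the two firewalls are placed there.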

Further assume that later, the control plane decides that 
VSF 1 can now fall back down to two Web servers. This may 
be because the temporary increase in load on VSF 1 has 
decreased, or it may be because of some other administrative 
action taken. In response, the control plane will shut down 
CPU D by a special command that may include powering 
down the CPU. Once the CPU has shut down, the control 
plane removes ports v8 and v9 from VLAN 2, and also 
removes SAN port s4 from SAN zone 1. Port s4 is placed in 
an idle SAN zone. The idle SAN zone may be designated,
for example, SAN Zone I (for Idle) or Zone 0. 

Some time later, the control plane may decide to add 
another node to VSF 2. This may be because the load on the
Web servers in VSF 2 has temporarily increased or it may be
due to other reasons. Accordingly, the control plane decides 
to place CPU D in VSF 2, as indicated by dashed path 802. 
In order to do this, it configures the VLAN switch to include 
ports v8, v9 in VLAN 3 and SAN port s4 in SAN zone 2.
CPU D is pointed to the portion of the storage on disk device
2 that contains a bootable image of the OS and Web server 
software required for servers in VSF 2. Also, CPU D is 
granted read-only access to data in a file system shared by 
the other Web servers in VSF 2. CPU D is powered back up,
and it now runs as a load-balanced Web server in VSF 2, and
can no longer access any data in SAN zone 1 or the CPUs 
attached to VLAN 2. In particular, CPU D has no way of 
accessing any element of VSF 1, even though at an earlier 
point in time it was part of VSF 1. 
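
The movement of CPU D from VSF 1 through the idle state into VSF 2 can be sketched as two steps. The function names and data layout are hypothetical, and returning the VLAN ports to an idle VLAN is an assumption: the text states this explicitly only for the SAN port.

```python
IDLE_VLAN = "VLAN I"
IDLE_ZONE = "SAN Zone I"


def release_node(node, vlan_of, zone_of, powered):
    """Shut the node down, then return its ports to the idle VLAN and zone."""
    powered[node["name"]] = False
    for port in node["vlan_ports"]:
        vlan_of[port] = IDLE_VLAN
    zone_of[node["san_port"]] = IDLE_ZONE


def attach_node(node, vlan, zone, vlan_of, zone_of, powered):
    """Place an idle, powered-down node into a VSF and power it back up."""
    if powered[node["name"]]:
        raise RuntimeError("node must be shut down before it is re-zoned")
    for port in node["vlan_ports"]:
        vlan_of[port] = vlan
    zone_of[node["san_port"]] = zone
    powered[node["name"]] = True


# CPU D while it is still a member of VSF 1.
cpu_d = {"name": "D", "vlan_ports": ["v8", "v9"], "san_port": "s4"}
vlan_of = {"v8": "VLAN 2", "v9": "VLAN 2"}
zone_of = {"s4": "SAN zone 1"}
powered = {"D": True}

release_node(cpu_d, vlan_of, zone_of, powered)
idle_snapshot = (dict(vlan_of), dict(zone_of))  # state while CPU D is idle
attach_node(cpu_d, "VLAN 3", "SAN zone 2", vlan_of, zone_of, powered)
```

Because the node passes through the idle state with no ports in VLAN 2 or SAN zone 1, nothing of its former membership in VSF 1 survives the move in this sketch.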

Further, in this configuration, the security perimeter
enforced by CPU E has dynamically expanded to include
CPU D. Thus, embodiments provide dynamic firewalling
that automatically adjusts to properly protect computing 
elements that are added to or removed from a VSF. 

For purposes of explanation, embodiments have been 
described herein in the context of port-based SAN zoning. 
Other types of SAN zoning may also be used. For example, 
LUN level SAN zoning may be used to create SAN zones 
based upon logical volumes within disk arrays. An example
product that is suitable for LUN level SAN zoning is the 
Volume Logics Product from EMC Corporation. 

DISK DEVICES ON THE SAN 

There are several ways by which a CPU can be pointed to
a particular device on the SAN, for booting up purposes, or 
for accessing disk storage which needs to be shared with 
other nodes, or otherwise provided with information about 
where to find bootup programs and data. 

One way is to provide a SCSI-to-Fibre Channel bridging
device attached to a computing element and a SCSI interface
for the local disks. By routing that SCSI port to the right
drive on the Fibre-Channel SAN, the computer can access
the storage device on the Fibre-Channel SAN just as it
would access a locally attached SCSI disk. Therefore,
software such as boot-up software simply boots off the disk
device on the SAN just as it would boot off a locally attached
SCSI disk.

Another way is to have a Fibre-Channel interface on the 
node and associated device driver and boot ROM and OS
software that permits the Fibre-Channel interface to be used 
as a boot device. 

Yet another way is to have an interface card (e.g., PCI bus
or Sbus) which appears to be a SCSI or IDE device controller
but that in turn communicates over the SAN to access
the disk. Operating systems such as Solaris integrally
provide diskless boot functions that can be used in this
alternative.

Typically there will be two kinds of SAN disk devices
associated with a given node. The first is one which is not
logically shared with other computing elements, and constitutes
what is normally a per-node root partition containing
bootable OS images, local configuration files, etc. This is the
equivalent of the root file system on a Unix system.

The second kind of disk is shared storage with other 
nodes. The kind of sharing varies by the OS software 
running on the CPU and the needs of the nodes accessing the
shared storage. If the OS provides a cluster file system that
allows read/write access of a shared-disk partition between
multiple nodes, the shared disk is mounted as such a cluster
file system. Similarly, the system may use database software
such as Oracle Parallel Server that permits multiple nodes
running in a cluster to have concurrent read/write access to
a shared disk. In such cases, a shared disk is already
designed into the base OS and application software.

For operating systems where such shared access is not
possible, because the OS and associated applications cannot
manage a disk device shared with other nodes, the shared
disk can be mounted as a read-only device. For many Web
applications, having read-only access to Web related files is
sufficient. For example, in Unix systems, a particular file
system may be mounted as read-only.

The configuration described above in connection with
FIG. 5 can be expanded to a large number of computing and
storage nodes by interconnecting a plurality of VLAN
switches to form a large switched VLAN fabric, and by
interconnecting multiple SAN switches to form a large
switched SAN mesh. In this case, a computing grid has the
architecture generally shown in FIG. 5, except that the
SAN/VLAN switched mesh contains a very large number of
ports for CPUs and storage devices. A number of computing
elements running the control plane can be physically connected
to the control ports of the VLAN/SAN switches, as
described further below. Interconnection of multiple VLAN
switches to create complex multi-campus data networks is
known in this field. See, for example, G. Haviland, "Designing
High-Performance Campus Intranets with Multilayer
Switching," Cisco Systems, Inc., and information available
from Brocade.

SAN ARCHITECTURE

The description assumes that the SAN comprises Fibre-Channel
switches and disk devices, and potentially Fibre-Channel
edge devices such as SCSI-to-Fibre Channel
bridges. However, SANs may be constructed using alternative
technologies, such as Gigabit Ethernet switches, or
switches that use other physical layer protocols. In
particular, there are efforts currently underway to construct
SANs over IP networks by running the SCSI protocol over
IP. The methods and architecture described above are adaptable
to these alternative methods of constructing a SAN.
When a SAN is constructed by running a protocol like SCSI
over IP over a VLAN capable layer 2 environment, then
SAN zones are created by mapping them to different
VLANs.

Also, Network Attached Storage (NAS) may be used,
which works over LAN technologies such as Fast Ethernet or
Gigabit Ethernet. With this option, different VLANs are
used in place of the SAN zones in order to enforce security
and the logical partitioning of the computing grid. Such
NAS devices typically support network file systems such as
Sun's NFS protocol, or Microsoft's SMB, to allow multiple
nodes to share the same storage.

CONTROL PLANE IMPLEMENTATION

As described herein, control planes may be implemented
as one or more processing resources that are coupled to
control and data ports of the SAN and VLAN switches. A
variety of control plane implementations may be used and
the invention is not limited to any particular control plane
implementation. Various aspects of control plane implementation
are described in more detail in the following sections:
1) control plane architecture; 2) master segment manager
election; 3) administrative functions; and 4) policy and
security considerations.

1. Control Plane Architecture

According to one embodiment, a control plane is implemented
as a control process hierarchy. The control process
hierarchy generally includes one or more master segment
manager mechanisms that are communicatively coupled to
and control one or more slave segment manager mechanisms.
One or more slave segment manager mechanisms
control one or more farm managers. The one or more farm
managers manage one or more VSFs. The master and slave
segment manager mechanisms may be implemented in hardware
circuitry, computer software, or any combination
thereof.

FIG. 9 is a block diagram 900 that illustrates a logical
relationship between a control plane 902 and a computing
grid 904 according to one embodiment. Control plane 902
controls and manages computing, networking and storage
elements contained in computing grid 904 through special
control ports or interfaces of the networking and storage
elements in computing grid 904. Computing grid 904
includes a number of VSFs 906 or logical resource groups
created in accordance with an embodiment as previously
described herein.

According to one embodiment, control plane 902 includes
a master segment manager 908, one or more slave segment
managers 910 and one or more farm managers 912. Master
segment manager 908, slave segment managers 910 and
farm managers 912 may be co-located on a particular computing
platform or may be distributed on multiple computing
platforms. For purposes of explanation, only a single master
segment manager 908 is illustrated and described; however,
any number of master segment managers 908 may be
employed.

Master segment manager 908 is communicatively
coupled to, controls and manages slave segment managers
910. Each slave segment manager 910 is communicatively
coupled to and manages one or more farm managers 912.
According to one embodiment, each farm manager 912 is
co-located on the same computing platform as the corresponding
slave segment managers 910 with which it is
communicatively coupled. Farm managers 912 establish,
configure and maintain VSFs 906 on computing grid 904.
According to one embodiment, each farm manager 912 is
assigned a single VSF 906 to manage; however, farm
managers 912 may also be assigned multiple VSFs 906.
Farm managers 912 do not communicate directly with each
other, but only through their respective slave segment managers
910. Slave segment managers 910 are responsible for
monitoring the status of their assigned farm managers 912.
Slave segment managers 910 restart any of their assigned
farm managers 912 that have stalled or failed.

Master segment manager 908 monitors the loading of
VSFs 906 and determines an amount of resources to be
allocated to each VSF 906. Master segment manager 908
then instructs slave segment managers 910 to allocate and
de-allocate resources for VSFs 906 as appropriate through
farm managers 912. A variety of load balancing algorithms
may be implemented depending upon the requirements of a
particular application and the invention is not limited to any
particular load balancing approach.

Master segment manager 908 monitors loading information
for the computing platforms on which slave segment
managers 910 and farm managers 912 are executing to
determine whether computing grid 904 is being adequately
serviced. Master segment manager 908 allocates and
de-allocates slave segment managers 910 and instructs slave
segment managers 910 to allocate and de-allocate farm
managers 912 as necessary to provide adequate management
of computing grid 904. According to one embodiment,
master segment manager 908 also manages the assignment
of VSFs to farm managers 912 and the assignment of farm
managers 912 to slave segment managers 910 as necessary
to balance the load among farm managers 912 and slave
segment managers 910. According to one embodiment, slave
segment managers 910 actively communicate with master
segment manager 908 to request changes to computing
grid 904 and to request additional slave segment managers
910 and/or farm managers 912. If a processing platform fails
on which one or more slave segment managers 910 and one
or more farm managers 912 are executing, then master
segment manager 908 reassigns the VSFs 906 from the farm
managers 912 on the failed computing platform to other
farm managers 912. In this situation, master segment manager
908 may also instruct slave segment managers 910 to
initiate additional farm managers 912 to handle the reassignment
of VSFs 906. Actively managing the number of
computational resources allocated to VSFs 906, the number
of active farm managers 912 and slave segment managers
910 allows overall power consumption to be controlled. For
example, to conserve power master segment manager 908
may shut down computing platforms that have no active
slave segment managers 910 or farm managers 912. The
power savings can be significant with large computing grids
904 and control planes 902.

According to one embodiment, master segment manager
908 manages slave segment managers 910 using a registry.
The registry contains information about current slave segment
managers 910 such as their state and assigned farm
managers 912 and assigned VSFs 906. As slave segment
managers 910 are allocated and de-allocated, the registry is
updated to reflect the change in slave segment managers
910. For example, when a new slave segment manager 910
is instantiated by master segment manager 908 and assigned
one or more VSFs 906, the registry is updated to reflect the
creation of the new slave segment manager 910 and its
assigned farm managers 912 and VSFs 906. Master segment
manager 908 may then periodically examine the registry to
determine how to best assign VSFs 906 to slave segment
managers 910.

According to one embodiment, the registry contains information
about master segment manager 908 that can be
accessed by slave segment managers 910. For example, the
registry may contain data that identifies one or more active
master segment managers 908 so that when a new slave
segment manager 910 is created, the new slave segment
manager 910 may check the registry to learn the identity of
the one or more master segment managers 908.

The registry may be implemented in many forms and the
invention is not limited to any particular implementation.
For example, the registry may be a data file stored on a
database 914 within control plane 902. The registry may
instead be stored outside of control plane 902. For example,
the registry may be stored on a storage device in computing
grid 904. In this example, the storage device would be
dedicated to control plane 902 and not allocated to VSFs
906.

2. Master Segment Manager Election

In general, a master segment manager is elected when a
control plane is established or after a failure of an existing
master segment manager. Although there is generally a
single master segment manager for a particular control
plane, there may be situations where it is advantageous to
elect two or more master segment managers to co-manage
the slave segment managers in the control plane.

According to one embodiment, slave segment managers
in a control plane elect a master segment manager for that
control plane. In the case where there is no master
segment manager and only a single slave segment manager,
then the slave segment manager becomes the master segment
manager and allocates additional slave segment managers
as needed. If there are two or more slave segment
managers, then the two or more slave processes elect a new
master segment manager by vote, e.g., by a quorum.

Since slave segment managers in a control plane are not
necessarily persistent, particular slave segment managers
may be selected to participate in a vote. For example,
according to one embodiment, the registry includes a timestamp
for each slave segment manager that is periodically
updated by each slave segment manager. The slave segment
managers with timestamps that have been most recently
updated, as determined according to specified selection
criteria, are most likely to still be executing and are selected
to vote for a new master segment manager. For example, a
specified number of the most recent slave segment managers
may be selected for a vote.

According to another embodiment, an election sequence
number is assigned to all active slave segment managers and
a new master segment manager is determined based upon
the election sequence numbers for the active slave segment
managers. For example, the lowest or highest election
sequence number may be used to select a particular slave
segment manager to be the next (or first) master segment
manager.

Once a master segment manager has been established, the
slave segment managers in the same control plane as the
master segment manager periodically perform a health
check on the master segment manager by contacting (ping)
the current master segment manager to determine whether
the master segment manager is still active. If a determination
is made that the current master segment manager is no
longer active, then a new master segment manager is
elected.

FIG. 10 depicts a state diagram 1000 of a master segment
manager election according to an embodiment. In state
1002, which is the slave segment manager main loop, the
slave segment manager waits for the expiration of a ping
timer. Upon expiration of the ping timer, state 1004 is
entered. In state 1004, the slave segment manager pings the
master segment manager. Also in state 1004, timestamp (TS)
for the slave segment manager is updated. If the master
segment manager responds to the ping, then the master
segment manager is still active and control returns to state
1002. If no response is received from the master segment
manager after a specified period of time, then state 1006 is
entered.

In state 1006, an active slave segment manager list is
obtained and control proceeds to state 1008. In state 1008,
a check is made to determine whether other slave segment
managers have also not received a response from the master
segment manager. Instead of sending messages to slave
segment managers to make this determination, this information
may be obtained from a database. If the slave
segment managers do not agree that master segment manager
is no longer active, i.e., one or more of the slave
segment managers received a timely response from the
master segment manager, then it is presumed that the current
master segment manager is still active and control returns to
state 1002. If a specified number of the slave segment
managers have not received a timely response from the
current master segment manager, then it is assumed that the
current master segment manager is "dead", i.e., no longer
active, and control proceeds to state 1010.

In state 1010, the slave segment manager that initiated the
process retrieves a current election number from an election
table and the next election number from a database. The
slave segment manager then updates the election table to
include an entry that specifies the next election number and
a unique address into a master election table. Control then
proceeds to state 1012 where the slave segment manager
reads the lowest sequence number for the current election
number. In state 1014, a determination is made whether the
particular slave segment manager has the lowest sequence
number. If not, then control returns to state 1002. If so, then
control proceeds to state 1016 where the particular slave
segment manager becomes the master segment manager.
Control then proceeds to state 1018 where the election
number is incremented.

As described above, slave segment managers are generally
responsible for servicing their assigned VSFs and allocating
new VSFs in response to instructions from the master
segment manager. Slave segment managers are also responsible
for checking on the master segment manager and
electing a new master segment manager if necessary.

FIG. 11 is a state diagram 1100 that illustrates various
states of a slave segment manager according to an embodiment.
Processing starts in a slave segment manager start
state 1102. From state 1102, control proceeds to state 1104
in response to a request to confirm the state of the current
master segment manager. In state 1104, the slave segment
manager sends a ping to the current master segment manager
to determine whether the current master segment manager is
still active. If a timely response is received from the current
master segment manager, then control proceeds to state 1106.
In state 1106, a message is broadcast to other slave segment
managers to indicate that the master segment manager
responded to the ping. From state 1106, control returns to
start state 1102.

In state 1104, if no timely master response is received, then
control proceeds to state 1108. In state 1108, a message is
broadcast to other slave segment managers to indicate that
the master segment manager did not respond to the ping.
Control then returns to start state 1102. Note that if a
sufficient number of slave segment managers do not receive
a response from the current master segment manager, then a
new master segment manager is elected as described herein.

From start state 1102, control proceeds to state 1110 upon
receipt of a request from the master segment manager to
restart a VSF. In state 1110, a VSF is restarted and control
returns to start state 1102.

As described above, a master segment manager is generally
responsible for ensuring that VSFs in the computing
grid controlled by the master segment manager are
adequately serviced by one or more slave segment managers.
To accomplish this, the master segment manager performs
regular health checks on all slave segment managers

ager 910 does not respond in a specified period of time,
master segment manager 908 attempts to restart the particular
slave segment manager 910. If the particular slave
segment manager 910 cannot be restarted, then master
segment manager 908 re-assigns the farm managers 912
from the failed slave segment manager 910 to another slave
segment manager 910. Master segment manager 908 may
then instantiate one or more additional slave segment managers
910 to re-balance the process loading. According to
one embodiment, master segment manager 908 monitors the
health of the computing platforms on which slave segment
managers 910 are executing. If a computing platform fails,
then master segment manager 908 reassigns the VSFs
assigned to farm managers 912 on the failed computing
platform to farm managers 912 on another computing platform.

FIG. 12 is a state diagram 1200 for a master segment
manager. Processing starts in a master segment manager
start state 1202. From state 1202, control proceeds to state
1204 when master segment manager 908 makes a periodic
health check or request to slave segment managers 910 in
control plane 902. From state 1204, if all slave segment
managers 910 respond as expected, then control returns to
state 1202. This occurs if all slave segment managers 910
provide information to master segment manager 908
indicating that all slave segment managers 910 are
operating normally. If one or more slave segment managers
910 either don't respond, or the response otherwise indicates
that one or more slave segment managers 910 have failed,
then control proceeds to state 1206.

In state 1206, master segment manager 908 attempts to
restart the failed slave segment managers 910. This may be
accomplished in several ways. For example, master segment
manager 908 may send a restart message to a non-responsive
or failed slave segment manager 910. From state 1206, if all
slave segment managers 910 respond as expected, i.e., have
been successfully restarted, then control returns to state
1202. For example, when a failed slave segment manager
910 is successfully restarted, the slave segment manager 910
sends a restart confirmation message to master segment
manager 908. From state 1206, if one or more slave segment
managers have not been successfully restarted, then control
proceeds to state 1208. This situation may occur if master
segment manager 908 does not receive a restart confirmation
message from a particular slave segment manager 910.

In state 1208, master segment manager 908 determines
the current loading of the machines on which slave segment
managers 910 are executing. To obtain the slave segment
manager 910 loading information, master segment manager
908 polls slave segment managers 910 directly or obtains the
loading information from another location, for example
from database 914. The invention is not limited to any
particular approach for master segment manager 908 to
obtain the loading information for slave segment managers
910.

Control then proceeds to state 1210 where the VSFs 906
assigned to the failed slave segment managers 910 are
re-assigned to other slave segment managers 910. The slave
segment managers 910 to which the VSFs 906 are assigned
inform master segment manager 908 when the reassignment
has been completed. For example, slave segment managers
910 may send a reassignment confirmation message to
in the same control plane as the master segment manager. master segment manager 908 to indicate that the reassign- 
According to one embodiment, master segment manager ment of VSFs 906 has been successfully completed. Control 
908 periodically requests status information from slave remains in state 1210 until reassignment of all VSFs 906 
segment managers 910. The information may include, for 65 associated with the failed slave segment managers 910 has 
example, which VSFs 906 are being serviced by slave been confirmed. Once confirmed, control returns to state 
segment managers 910. If a particular slave segment man- 1202. 
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Instead of reassigning VSFs 906 associated with a failed 
slave segment manager 910 to other active slave segment 
managers 910, master segment manager 908 may allocate 
additional slave segment managers 910 and then assign 
those VSFs 906 to the new slave segment managers 910. The
choice of whether to reassign VSFs 906 to existing slave 
segment managers 910 or to new slave segment managers 
910 depends, at least in part, on latencies associated with 
allocating new slave segment managers 910 and latencies 
associated with reassigning VSFs 906 to an existing slave
segment manager 910. Either approach may be used depend- 
ing upon the requirements of a particular application and the 
invention is not limited to either approach. 
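
The master segment manager cycle through states 1204, 1206, 1208 and 1210 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not part of the disclosure: the `SlaveManager` class, the `health_check_cycle` function, the use of "number of VSFs serviced" as the loading metric, and the least-loaded reassignment policy are assumptions chosen only to make the control flow concrete.

```python
from dataclasses import dataclass, field


@dataclass
class SlaveManager:
    """Hypothetical stand-in for a slave segment manager 910."""
    name: str
    alive: bool = True          # whether it answers health checks
    restartable: bool = True    # whether a restart message revives it
    vsfs: list = field(default_factory=list)

    def load(self) -> int:
        # Assumed loading metric: number of VSFs currently serviced.
        return len(self.vsfs)


def health_check_cycle(slaves):
    """Run one pass through states 1204, 1206, 1208 and 1210.

    Returns a dict mapping each reassigned VSF to its new slave name.
    """
    # State 1204: find slaves that do not respond as expected.
    failed = [s for s in slaves if not s.alive]
    if not failed:
        return {}

    # State 1206: attempt to restart each failed slave.
    for s in failed:
        if s.restartable:
            s.alive = True
    still_failed = [s for s in failed if not s.alive]
    if not still_failed:
        return {}

    # State 1208: gather loading information for the healthy slaves.
    healthy = [s for s in slaves if s.alive]
    if not healthy:
        raise RuntimeError("no healthy slave segment manager available")

    # State 1210: move each orphaned VSF to the least-loaded healthy slave.
    moves = {}
    for s in still_failed:
        while s.vsfs:
            vsf = s.vsfs.pop(0)
            target = min(healthy, key=SlaveManager.load)
            target.vsfs.append(vsf)
            moves[vsf] = target.name
    return moves
```

Allocating new slave segment managers instead of reusing existing ones, as the alternative above describes, would replace only the selection of `target` in this sketch.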

3. Administrative Functions

According to one embodiment, control plane 902 is 
communicatively coupled to a global grid manager. Control 
plane 902 provides billing, fault, capacity, loading and other 
computing grid information to the global grid manager. FIG. 
13 is a block diagram 1300 that illustrates the use of a global
grid manager according to an embodiment. 

In FIG. 13, a computing grid 1300 is partitioned into 
logical portions called grid segments 1302. Each grid seg- 
ment 1302 includes a control plane 902 that controls and 
manages a data plane 904. In this example, each data plane 
904 is the same as the computing grid 904 of FIG. 9, but is
referred to as a "data plane" to illustrate the use of a global
grid manager to manage multiple control planes 902 and
data planes 904, i.e., grid segments 1302.

Each grid segment is communicatively coupled to a 
global grid manager 1304. Global grid manager 1304, 
control planes 902 and computing grids 904 may be 
co-located on a single computing platform or may be 
distributed across multiple computing platforms and the
invention is not limited to any particular implementation. 

Global grid manager 1304 provides centralized manage- 
ment and services for any number of grid segments 1302. 
Global grid manager 1304 may collect billing, loading and
other information from control planes 902 for use in a variety
of administrative tasks. For example, the billing information
is used to bill for services provided by computing grids 904. 
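
As a hedged illustration of the centralized collection described above, the following Python sketch sums usage reported by several grid segments and prices it. The record layout of `(customer, cpu_hours)` pairs, the function names `aggregate_usage` and `invoice`, and the single per-CPU-hour rate are all assumptions made for this example; the text does not specify any of them.

```python
from collections import defaultdict


def aggregate_usage(segment_reports):
    """Sum CPU-hours per customer across all grid segments.

    segment_reports maps a grid segment name to a list of
    (customer, cpu_hours) records reported by that segment's
    control plane. This record layout is an assumption.
    """
    totals = defaultdict(float)
    for records in segment_reports.values():
        for customer, cpu_hours in records:
            totals[customer] += cpu_hours
    return dict(totals)


def invoice(totals, rate_per_cpu_hour):
    """Turn aggregated usage into amounts owed, rounded to cents."""
    return {c: round(h * rate_per_cpu_hour, 2) for c, h in totals.items()}
```

A global grid manager built this way needs only the per-segment reports, so grid segments can be added without changing the aggregation step.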

4. Policy and Security Considerations 

As described herein, a slave segment manager in a control
plane must be able to communicate with its assigned VSFs 
in a computing grid. Similarly, VSFs in a computing grid 
must be able to communicate with their assigned slave 
segment manager. Further, VSFs in a computing grid must 
not be allowed to communicate with each other to prevent
one VSF from in any way causing a change in the configu- 
ration of another VSF. Various approaches for implementing 
these policies are described hereinafter. 

FIG. 14 is a block diagram 1400 of an architecture for 
connecting a control plane to a computing grid according to
an embodiment. Control ("CTL") ports of VLAN switches 
(VLAN SW1 through VLAN SWn), collectively identified 
by reference numeral 1402, and SAN switches (SAN SW1 
through SAN SWn), collectively identified by reference 
numeral 1404, are connected to an Ethernet subnet 1406.
Ethernet subnet 1406 is connected to a plurality of comput- 
ing elements (CPU1, CPU2 through CPUn), that are collec- 
tively identified by reference numeral 1408. Thus, only 
computing elements of control plane 1408 are communicatively
coupled to the control ports (CTL) of VLAN switches
1402 and SAN switches 1404. This configuration prevents 
computing elements in a VSF (not illustrated), from changing
the membership of the VLANs and SAN zones associated
with itself or any other VSF. This approach is also
applicable to situations where the control ports are serial or 
parallel ports. In these situations, the ports are coupled to the 
control plane 1408 computing elements. 

FIG. 15 is a block diagram 1500 of a configuration for 
connecting control plane computing elements (CP CPU1, 
CP CPU2 through CP CPUn) 1502 to data ports according 
to an embodiment. In this configuration, control plane 
computing elements 1502 periodically send a packet to a
control plane agent 1504 that acts on behalf of control plane 
computing elements 1502. Control plane agent 1504 peri- 
odically polls computing elements 502 for real-time data and 
sends the data to control plane computing elements 1502. 
Each segment manager in control plane 1502 is communi- 
catively coupled to a control plane (CP) LAN 1506. CP LAN 
1506 is communicatively coupled to a special port V17 of 
VLAN Switch 504 through a CP firewall 1508. This con- 
figuration provides a scalable and secure means for control 
plane computing elements 1502 to collect real-time infor- 
mation from computing elements 502. 

FIG. 16 is a block diagram 1600 of an architecture for 
connecting a control plane to a computing grid according to 
an embodiment. A control plane 1602 includes control plane 
computing elements CP CPU1, CP CPU2 through CP CPUn. 
Each control plane computing element CP CPU1, CP CPU2 
through CP CPUn in control plane 1602 is communicatively 
coupled to a port S1, S2 through Sn of a plurality of SAN
switches that collectively form a SAN mesh 1604. 

SAN mesh 1604 includes SAN ports So, Sp that are 
communicatively coupled to storage devices 1606 that con- 
tain data that is private to control plane 1602. Storage 
devices 1606 are depicted in FIG. 16 as disks for purposes 
of explanation. Storage devices 1606 may be implemented 
by any type of storage medium and the invention is not 
limited to any particular type of storage medium for storage 
devices 1606. Storage devices 1606 are logically located in 
a control plane private storage zone 1608. Control plane 
private storage zone 1608 is an area where control plane 
1602 maintains log files, statistical data, current control 
plane configuration information and software that imple- 
ments control plane 1602. SAN ports So, Sp are only part of 
the control plane private storage zone and are never placed 
on any other SAN zone so that only computing elements in 
control plane 1602 can access the storage devices 1606. 
Furthermore, ports S1, S2 through Sn, So and Sp are in a
control plane SAN zone that may only be communicatively 
coupled to computing elements in control plane 1602. These 
ports are not accessible by computing elements in VSFs (not 
illustrated). 

According to one embodiment, when a particular com- 
puting element CP CPU1, CP CPU2 through CP CPUn 
needs to access a storage device, or a portion thereof, that is 
part of a particular VSF, the particular computing element is 
placed into the SAN zone for the particular VSF. For 
example, suppose that computing element CP CPU2 needs
to access VSFi disks 1610. In this situation, port S2, which
is associated with control plane CP CPU2, is placed in the
SAN zone of VSFi, which includes port Si. Once computing 
element CP CPU2 is done accessing the VSFi disks 1610 on 
port Si, computing element CP CPU2 is removed from the 
SAN zone of VSFi. 
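
The place-then-remove pattern described above can be sketched in Python as a context manager, which guarantees that the control plane port leaves the VSF's SAN zone even if the disk access fails. The `SanZones` class is a hypothetical in-memory model and `temporary_zone_access` is an illustrative name; a real deployment would issue zoning commands to the SAN switches, which this sketch does not attempt.

```python
from contextlib import contextmanager


class SanZones:
    """Hypothetical in-memory model of SAN zone membership."""

    def __init__(self):
        self._zones = {}  # zone name -> set of port names

    def add(self, zone, port):
        self._zones.setdefault(zone, set()).add(port)

    def remove(self, zone, port):
        self._zones.get(zone, set()).discard(port)

    def members(self, zone):
        return frozenset(self._zones.get(zone, set()))


@contextmanager
def temporary_zone_access(zones, vsf_zone, cp_port):
    """Place a control plane port in a VSF's SAN zone, then remove it."""
    zones.add(vsf_zone, cp_port)
    try:
        yield
    finally:
        # Removal happens even if the disk access raises an exception.
        zones.remove(vsf_zone, cp_port)
```

For the example in the text, wrapping the disk access in `with temporary_zone_access(zones, "VSFi", "S2"):` keeps port S2 in the zone of VSFi only for the duration of the access.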

Similarly, suppose computing element CP CPU1 needs to
access VSFj disks 1612. In this situation, computing element
CP CPU1 is placed in the SAN zone associated with VSFj.
As a result, port S1 is placed in the SAN zone associated
with VSFj, which includes the zone containing port Sj. Once
computing element CP CPU1 is done accessing the VSFj
disks 1612 connected to port Sj, computing element CP
CPU1 is removed from the SAN zone associated with VSFj.
This approach ensures the integrity of control plane
computing elements and the control plane storage zone 1608 by
tightly controlling access to resources using tight SAN zone
control.

As previously described, a single control plane computing
element may be responsible for managing several VSFs.
Accordingly, a single control plane computing element must
be capable of manifesting itself in multiple VSFs
simultaneously, while enforcing firewalling between the
VSFs according to policy rules established for each control
plane. Policy rules may be stored in database 914 (FIG. 9)
of each control plane or implemented by central segment
manager 1302 (FIG. 13).

According to one embodiment, tight binding between
VLAN tagging and IP addresses is used to prevent spoofing
attacks by a VSF since (physical switch) port-based VLAN
tags are not spoofable. An incoming IP packet on a given
VLAN interface must have the same VLAN tag and IP
address as the logical interface on which the packet arrives.
This prevents IP spoofing attacks where a malicious server
in a VSF spoofs the source IP address of a server in another
VSF and potentially modifies the logical structure of another
VSF or otherwise subverts the security of computing grid
functions. Circumventing this VLAN tagging approach
requires physical access to the computing grid which can be
prevented using high security (Class A) data centers.

A variety of network frame tagging formats may be used
to tag data packets and the invention is not limited to any
particular tagging format. According to one embodiment,
IEEE 802.1q VLAN tags are used, although other formats
may also be suitable. In this example, a VLAN/IP address
consistency check is performed at a subsystem in the IP
stack where 802.1q tag information is present to control
access. In this example, computing elements are configured
with a VLAN capable network interface card (NIC) in a
manner that allows the computing elements to be
communicatively coupled to multiple VLANs simultaneously.

FIG. 17 is a block diagram 1700 of an arrangement for
enforcing tight binding between VLAN tags and IP
addresses according to an embodiment. Computing elements
1702 and 1704 are communicatively coupled to ports v1 and
v2 of a VLAN switch 1706 via NICs 1708 and 1710,
respectively. VLAN switch 1706 is also communicatively
coupled to access switches 1712 and 1714. Ports v1 and v2
are configured in tagged mode. According to one
embodiment, IEEE 802.1q VLAN tag information is provided
by VLAN switch 1706.

The VSF described above can be distributed over a WAN
in several ways.

In one alternative, a wide area backbone may be based on
Asynchronous Transfer Mode (ATM) switching. In this case,
each local area VLAN is extended into a wide area using
Emulated LANs (ELANs) which are part of the ATM LAN
Emulation (LANE) standard. In this way, a single VSF can
span across several wide area links, such as ATM/SONET/
OC-12 links. An ELAN becomes part of a VLAN which
extends across the ATM WAN.

Alternatively, a VSF is extended across a WAN using a
VPN system. In this embodiment, the underlying
characteristics of the network become irrelevant, and the VPN is used
to interconnect two or more VSFs across the WAN to make
a single distributed VSF.

Mirroring technologies can be used in order to have
local copies of the data in a distributed VSF. Alternatively,
the SAN is bridged over the WAN using one of several SAN
to WAN bridging techniques, such as SAN-to-ATM bridging
or SAN-to-Gigabit Ethernet bridging. SANs constructed
over IP networks naturally extend over the WAN since IP
works well over such networks.

FIG. 18 is a block diagram of a plurality of VSFs extended
over WAN connections. A San Jose Center, New York
Center, and London center are coupled by WAN
connections. Each WAN connection comprises an ATM, ELAN, or
VPN connection in the manner described above. Each center
comprises at least one VSF and at least one Idle Pool. For
example, the San Jose center has VSF1A and Idle Pool A. In
this configuration, the computing resources of each Idle Pool
of a center are available for allocation or assignment to a
VSF located in any other center. When such allocation or
assignment is carried out, a VSF becomes extended over the
WAN.

EXAMPLE USES OF VSFS

The VSF architecture described in the examples above
may be used in the context of a Web server system. Thus, the
foregoing examples have been described in terms of Web
servers, application servers and database servers constructed
out of the CPUs in a particular VSF. However, the VSF
architecture may be used in many other computing contexts
and to provide other kinds of services; it is not limited to
Web server systems.

A DISTRIBUTED VSF AS PART OF A
CONTENT DISTRIBUTION NETWORK

In one embodiment, a VSF provides a Content Distribution
Network (CDN) using a wide area VSF. The CDN is a
network of caching servers that performs distributed caching
of data. The network of caching servers may be
implemented, for example, using Traffic Server (TS) software
commercially available from Inktomi Corporation, San
Mateo, Calif. TS is a cluster aware system; the system scales
as more CPUs are added to a set of caching Traffic Server
computing elements. Accordingly, it is well suited to a
system in which adding CPUs is the mechanism for scaling
upwards.

In this configuration, a system can dynamically add more
CPUs to that portion of a VSF that runs caching software
such as TS, thereby growing the cache capacity at a point
close to where bursty Web traffic is occurring. As a result, a
CDN may be constructed that dynamically scales in CPU
and I/O bandwidth in an adaptive way.

A VSF FOR HOSTED INTRANET APPLICATIONS

There is growing interest in offering Intranet applications
such as Enterprise Resource Planning (ERP), ORM and
CRM software as hosted and managed services. Technologies
such as Citrix WinFrame and Citrix MetaFrame allow
an enterprise to provide Microsoft Windows applications as
a service on a thin client such as a Windows CE device or
Web browser. A VSF can host such applications in a scalable
manner.

For example, the SAP R/3 ERP software, commercially
available from SAP Aktiengesellschaft of Germany, allows
an enterprise to load balance using multiple Application and
Database Servers. In the case of a VSF, an enterprise would
dynamically add more Application Servers (e.g., SAP Dialog
Servers) to a VSF in order to scale up the VSF based on
real-time demand or other factors.

Similarly, Citrix MetaFrame allows an enterprise to scale
up Windows application users on a server farm running the
hosted Windows applications by adding more Citrix servers.
In this case, for a VSF the Citrix MetaFrame VSF would
dynamically add more Citrix servers in order to accommodate
more users of MetaFrame hosted Windows applications.
It will be apparent that many other applications may be
hosted in a manner similar to the illustrative examples
described above.

CUSTOMER INTERACTION WITH A VSF

Since a VSF is created on demand, a VSF customer or
organization that "owns" the VSF may interact with the
system in various ways in order to customize a VSF. For
example, because a VSF is created and modified instantly
via the control plane, the VSF customer may be granted
privileged access to create and modify its VSF itself. The
privileged access may be provided using password
authentication provided by Web pages and security applications,
token card authentication, Kerberos exchange, or other
appropriate security elements.

In one exemplary embodiment, a set of Web pages are
served by the computing element, or by a separate server.
The Web pages enable a customer to create a custom VSF,
by specifying a number of tiers, the number of computing
elements in a particular tier, the hardware and software
platform used for each element, and things such as what kind
of Web server, application server, or database server
software should be pre-configured on these computing elements.
Thus, the customer is provided with a virtual provisioning
console.

After the customer or user enters such provisioning
information, the control plane parses and evaluates the order
and queues it for execution. Orders may be reviewed by
human managers to ensure that they are appropriate. Credit
checks of the enterprise may be run to ensure that it has
appropriate credit to pay for the requested services. If the
provisioning order is approved, the control plane may
configure a VSF that matches the order, and return to the
customer a password providing root access to one or more
of the computing elements in the VSF. The customer may
then upload master copies of applications to execute in the
VSF.

When the enterprise that hosts the computing grid is a
for-profit enterprise, the Web pages may also receive
payment related information, such as a credit card, a PO
number, electronic check, or other payment method.

In another embodiment, the Web pages enable the
customer to choose one of several VSF service plans, such as
automatic growth and shrinkage of a VSF between a
minimum and maximum number of elements, based on real-time
load. The customer may have a control value that allows the
customer to change parameters such as minimum number of
computing elements in a particular tier such as Web servers,
or a time period in which the VSF must have a minimal
amount of server capacity. The parameters may be linked to
billing software that would automatically adjust the
customer's bill rate and generate billing log file entries.

Through the privileged access mechanism the customer
can obtain reports and monitor real-time information related
to usage, load, hits or transactions per second, and adjust the
characteristics of a VSF based on the real-time information.
It will be apparent that the foregoing features offer
significant advantages over conventional manual approaches to
constructing a server farm. In the conventional approaches,
a user cannot automatically influence a server farm's
properties without going through a cumbersome manual procedure
of adding servers and configuring the server farm in various
ways.

BILLING MODELS FOR A VSF

Given the dynamic nature of a VSF, the enterprise that
hosts the computing grid and VSFs may bill service fees to
customers who own VSFs using a billing model for a VSF
which is based on actual usage of the computing elements
and storage elements of a VSF. It is not necessary to use a
flat fee billing model. The VSF architecture and methods
disclosed herein enable a "pay-as-you-go" billing model
because the resources of a given VSF are not statically
assigned. Accordingly, a particular customer having a highly
variable load on its server farm could save money
because it would not be billed a rate associated with constant
peak capacity, but rather, a rate that reflects a running
average of usage, instantaneous usage, etc.

For example, an enterprise may operate using a billing
model that stipulates a flat fee for a minimum number of
computing elements, such as 10 servers, and stipulates that
when real-time load requires more than 10 elements, then
the user is billed at an incremental rate for the extra servers,
based on how many extra servers were needed and for the
length of time they are needed. The units of such bills
may reflect the resources that are billed. For example, bills
may be expressed in units such as MIPS-hours, CPU-hours,
thousands of CPU seconds, etc.

A CUSTOMER VISIBLE CONTROL PLANE API

In another alternative, the capacity of a VSF may be
controlled by providing the customer with an application
programming interface (API) that defines calls to the
control plane for changing resources. Thus, an application program
prepared by the customer could issue calls or requests using
the API to ask for more servers, more storage, more
bandwidth, etc. This alternative may be used when the
customer needs the application program to be aware of the
computing grid environment and to take advantage of the
capabilities offered by the control plane.

Nothing in the above-disclosed architecture requires the
customer to modify its application for use with the
computing grid. Existing applications continue to work as they do
in manually configured server farms. However, an
application can take advantage of the dynamism possible in the
computing grid, if it has a better understanding of the
computing resources it needs based on the real-time load
monitoring functions provided by the control plane. An API
of the foregoing nature, which enables an application
program to change the computing capacity of a server farm, is
not possible using existing manual approaches to
constructing a server farm.

AUTOMATIC UPDATING AND VERSIONING

Using the methods and mechanisms disclosed herein, the
control plane may carry out automatic updating and
versioning of operating system software that is executed in
computing elements of a VSF. Thus, the end user or
customer is not required to worry about updating the operating
system with a new patch, bug fix, etc. The control plane can
maintain a library of such software elements as they are
received and automatically distribute and install them in
computing elements of all affected VSFs.
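
One way the control plane might decide which computing elements are affected by a newly received operating system patch is sketched below in Python. The `Patch` record, the integer patch levels, the `plan_updates` function, and the nested dictionary describing VSFs are all hypothetical; they are assumptions made for illustration and are not described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    os_name: str   # operating system the patch applies to
    version: int   # assumed monotonically increasing patch level


def plan_updates(library, vsfs):
    """Return (vsf, element, patch_version) installs that are needed.

    library: list of Patch objects held by the control plane.
    vsfs: dict mapping VSF name -> dict mapping element name ->
          (os_name, installed_version).
    """
    # Find the newest patch level available for each operating system.
    latest = {}
    for p in library:
        latest[p.os_name] = max(latest.get(p.os_name, 0), p.version)

    # An element is affected when a newer patch exists for its OS.
    plan = []
    for vsf, elements in sorted(vsfs.items()):
        for element, (os_name, installed) in sorted(elements.items()):
            newest = latest.get(os_name)
            if newest is not None and newest > installed:
                plan.append((vsf, element, newest))
    return plan
```

Elements running an operating system with no entry in the library are simply skipped, so the plan lists only the affected VSFs.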
IMPLEMENTATION MECHANISMS

The computing elements and control plane may be imple- 
mented in several forms and the invention is not limited to 
any particular form. In one embodiment, each computing 
element is a general purpose digital computer having the 
elements shown in FIG. 19 except for non-volatile storage 
device 1910, and the control plane is a general purpose 
digital computer of the type shown in FIG. 19 operating 
under control of program instructions that implement the 
processes described herein. 

FIG. 19 is a block diagram that illustrates a computer 
system 1900 upon which an embodiment of the invention 
may be implemented. Computer system 1900 includes a bus 
1902 or other communication mechanism for communicating
information, and a processor 1904 coupled with bus
1902 for processing information. Computer system 1900 
also includes a main memory 1906, such as a random access 
memory (RAM) or other dynamic storage device, coupled to 
bus 1902 for storing information and instructions to be
executed by processor 1904. Main memory 1906 also may 
be used for storing temporary variables or other intermediate 
information during execution of instructions to be executed 
by processor 1904. Computer system 1900 further includes 
a read only memory (ROM) 1908 or other static storage
device coupled to bus 1902 for storing static information and 
instructions for processor 1904. A storage device 1910, such 
as a magnetic disk or optical disk, is provided and coupled 
to bus 1902 for storing information and instructions. 

Computer system 1900 may be coupled via bus 1902 to a
display 1912, such as a cathode ray tube (CRT), for dis- 
playing information to a computer user. An input device 
1914, including alphanumeric and other keys, is coupled to 
bus 1902 for communicating information and command 
selections to processor 1904. Another type of user input
device is cursor control 1916, such as a mouse, a trackball, 
or cursor direction keys for communicating direction infor- 
mation and command selections to processor 1904 and for 
controlling cursor movement on display 1912. This input 
device typically has two degrees of freedom in two axes, a
first axis (e.g., x) and a second axis (e.g., y), that allows the 
device to specify positions in a plane. 

The invention is related to the use of computer system 
1900 for controlling an extensible computing system. 
According to one embodiment of the invention, controlling
an extensible computing system is provided by computer 
system 1900 in response to processor 1904 executing one or 
more sequences of one or more instructions contained in 
main memory 1906. Such instructions may be read into main 
memory 1906 from another computer-readable medium,
such as storage device 1910. Execution of the sequences of 
instructions contained in main memory 1906 causes proces- 
sor 1904 to perform the process steps described herein. One 
or more processors in a multi-processing arrangement may
also be employed to execute the sequences of instructions
contained in main memory 1906. In alternative 
embodiments, hard-wired circuitry may be used in place of 
or in combination with software instructions to implement 
the invention. Thus, embodiments of the invention are not 
limited to any specific combination of hardware circuitry
and software. 

The term "computer-readable medium" as used herein
refers to any medium that participates in providing instruc- 
tions to processor 1904 for execution. Such a medium may 
take many forms, including but not limited to, non-volatile
media, volatile media, and transmission media. Non-volatile
media includes, for example, optical or magnetic disks, such
as storage device 1910. Volatile media includes dynamic
memory, such as main memory 1906. Transmission media 
includes coaxial cables, copper wire and fiber optics, includ- 
ing the wires that comprise bus 1902. Transmission media 
can also take the form of acoustic or light waves, such as 
those generated during radio wave and infrared data com- 
munications. 

Common forms of computer-readable media include, for 
example, a floppy disk, a flexible disk, hard disk, magnetic 
tape, or any other magnetic medium, a CD-ROM, any other 
optical medium, punch cards, paper tape, any other physical 
medium with patterns of holes, a RAM, a PROM, an
EPROM, a FLASH-EPROM, any other memory chip or
cartridge, a carrier wave as described hereinafter, or any 
other medium from which a computer can read. 

Various forms of computer readable media may be 
involved in carrying one or more sequences of one or more 
instructions to processor 1904 for execution. For example, 
the instructions may initially be carried on a magnetic disk 
of a remote computer. The remote computer can load the 
instructions into its dynamic memory and send the instruc- 
tions over a telephone line using a modem. A modem local 
to computer system 1900 can receive the data on the 
telephone line and use an infrared transmitter to convert the 
data to an infrared signal. An infrared detector coupled to 
bus 1902 can receive the data carried in the infrared signal 
and place the data on bus 1902. Bus 1902 carries the data to 
main memory 1906, from which processor 1904 retrieves 
and executes the instructions. The instructions received by 
main memory 1906 may optionally be stored on storage 
device 1910 either before or after execution by processor 
1904. 

Computer system 1900 also includes a communication 
interface 1918 coupled to bus 1902. Communication inter- 
face 1918 provides a two-way data communication coupling 
to a network link 1920 that is connected to a local network 
1922. For example, communication interface 1918 may be 
an integrated services digital network (ISDN) card or a 
modem to provide a data communication connection to a 
corresponding type of telephone line. As another example, 
communication interface 1918 may be a local area network 
(LAN) card to provide a data communication connection to 
a compatible LAN. Wireless links may also be implemented. 
In any such implementation, communication interface 1918 
sends and receives electrical, electromagnetic or optical 
signals that carry digital data streams representing various 
types of information. 

Network link 1920 typically provides data communica- 
tion through one or more networks to other data devices. For 
example, network link 1920 may provide a connection 
through local network 1922 to a host computer 1924 or to 
data equipment operated by an Internet Service Provider 
(ISP) 1926. ISP 1926 in turn provides data communication 
services through the worldwide packet data communication 
network now commonly referred to as the "Internet" 1928. 
Local network 1922 and Internet 1928 both use electrical, 
electromagnetic or optical signals that carry digital data 
streams. The signals through the various networks and the 
signals on network link 1920 and through communication 
interface 1918, which carry the digital data to and from 
computer system 1900, are exemplary forms of carrier 
waves transporting the information. 

Computer system 1900 can send messages and receive 
data, including program code, through the network(s), net- 
work link 1920 and communication interface 1918. In the 
Internet example, a server 1930 might transmit a requested
code for an application program through Internet 1928, ISP
1926, local network 1922 and communication interface 
1918. In accordance with the invention, one such down- 
loaded application provides for controlling an extensible 
computing system as described herein.

The received code may be executed by processor 1904 as 
it is received, and/or stored in storage device 1910, or other 
non-volatile storage for later execution. In this manner, 
computer system 1900 may obtain application code in the 
form of a carrier wave. 

The computing grid disclosed herein may be compared 
conceptually to the public electric power network that is 
sometimes called the power grid. The power grid provides a 
scalable means for many parties to obtain power services 
through a single wide-scale power infrastructure. Similarly,
the computing grid disclosed herein provides computing 
services to many organizations using a single wide-scale 
computing infrastructure. Using the power grid, power con- 
sumers do not independently manage their own personal 
power equipment. For example, there is no reason for a 
utility consumer to run a personal power generator at its 
facility, or in a shared facility, and manage its capacity and
growth on an individual basis. Instead, the power grid 
enables the wide-scale distribution of power to vast seg- 
ments of the population, thereby providing great economies 
of scale. Similarly, the computing grid disclosed herein can 
provide computing services to vast segments of the popu- 
lation using a single wide-scale computing infrastructure. 

In the foregoing specification, the invention has been
described with reference to specific embodiments thereof. It 
will, however, be evident that various modifications and 
changes may be made thereto without departing from the 
broader spirit and scope of the invention. The specification 
and drawings are, accordingly, to be regarded in an illustrative
rather than a restrictive sense.

What is claimed is: 

1. A control apparatus comprising: 
a master control mechanism; and 

one or more slave control mechanisms communicatively
coupled to the master control mechanism and being
configured to, in response to one or more instructions
from the master control mechanism, establish a first
logical computing entity that contains a first subset of
processing resources and a first subset of storage
resources and is capable of operating independent of the
master control mechanism and the one or more slave
control mechanisms, wherein the first logical computing
entity is established by:

selecting the first subset of processing resources from a
set of processing resources,

selecting the first subset of storage resources from a set
of storage resources, and

causing the first subset of processing resources to be
communicatively coupled to the first subset of storage
resources.
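
By way of illustration only, the three establishing steps recited in claim 1 (select processing resources, select storage resources, couple them) might be sketched as follows. The names `LogicalEntity`, `SlaveControl` and `establish`, the string resource identifiers, and the all-to-all coupling are assumptions made for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalEntity:
    """Hypothetical record of one logical computing entity."""
    processors: set = field(default_factory=set)
    storage: set = field(default_factory=set)
    links: set = field(default_factory=set)  # (processor, storage) pairs


class SlaveControl:
    """Hypothetical slave control mechanism holding pools of free resources."""

    def __init__(self, free_processors, free_storage):
        self.free_processors = set(free_processors)
        self.free_storage = set(free_storage)

    def establish(self, n_processors, n_storage):
        """Carry out a master instruction to build one logical entity."""
        if (n_processors > len(self.free_processors)
                or n_storage > len(self.free_storage)):
            raise ValueError("not enough free resources")
        # Select the subsets (sorted only to make the choice deterministic).
        procs = set(sorted(self.free_processors)[:n_processors])
        disks = set(sorted(self.free_storage)[:n_storage])
        self.free_processors -= procs
        self.free_storage -= disks
        # Couple every selected processor to every selected storage unit
        # (an assumed coupling policy).
        links = {(p, d) for p in procs for d in disks}
        return LogicalEntity(procs, disks, links)


slave = SlaveControl(["cpu1", "cpu2", "cpu3"], ["disk1", "disk2"])
vsf = slave.establish(n_processors=2, n_storage=1)
```

Once established, the returned entity holds no reference to the slave, which loosely mirrors the requirement that the entity operate independently of the control mechanisms.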

2. A control apparatus as recited in claim 1, wherein the 
master control mechanism is a master control process 
executing on one or more processors and the one or more 
slave control mechanisms are one or more slave processes
executing on the one or more processors. 

3. A control apparatus as recited in claim 1, wherein the 
master control mechanism is one or more master processors 
and the one or more slave control mechanisms are one or 
more slave processors.

4. A control apparatus as recited in claim 1, wherein the 
master control mechanism is configured to, based upon slave
control process mechanism loading, dynamically reassign
control, between the one or more slave control mechanisms, 
of one or more processing resources from the subset of 
processing resources and one or more storage resources 
from the subset of storage resources. 

5. A control apparatus as recited in claim 1, wherein the 
master control mechanism is configured to, based upon slave 
control process mechanism loading, dynamically allocate 
one or more additional slave control mechanisms, and assign 
control of one or more processing resources from the subset 
of processing resources and one or more storage resources 
from the subset of storage resources to the one or more 
additional slave control mechanisms. 

6. A control apparatus as recited in claim 1, wherein the 
master control mechanism is configured to, based upon slave 
control process mechanism loading, reassign control to one 
or more other slave control mechanisms from the one or 
more slave control mechanisms of one or more particular 
processing resources from the subset of processing resources 
and one or more particular storage resources from the subset 
of storage resources that were previously assigned to one or 
more particular slave control mechanisms from the one or 
more slave control mechanisms, and 

dynamically de-allocate the one or more particular slave 
control mechanisms. 
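
By way of illustration only, the load-based reassignment of claim 4 and the allocation of additional slaves of claim 5 might be sketched as follows. The function `rebalance`, the `max_load` threshold, the `slave-new-N` naming, and the use of resource count as the measure of slave loading are all assumptions of this sketch; de-allocation as in claim 6 is not modeled.

```python
def rebalance(assignments, max_load):
    """Return a new slave -> resources mapping with no slave above max_load.

    assignments maps a slave name to the list of resources it controls.
    Load is assumed to be the number of controlled resources.
    """
    if max_load < 1:
        raise ValueError("max_load must be at least 1")
    result = {slave: list(res) for slave, res in assignments.items()}

    # Take resources away from every overloaded slave.
    overflow = []
    for res in result.values():
        while len(res) > max_load:
            overflow.append(res.pop())

    # Reassign each one to the least-loaded slave, allocating an
    # additional slave when every existing slave is already full.
    next_id = 0
    for item in overflow:
        target = min(result, key=lambda s: len(result[s]))
        if len(result[target]) >= max_load:
            while f"slave-new-{next_id}" in result:
                next_id += 1
            target = f"slave-new-{next_id}"
            result[target] = []
        result[target].append(item)
    return result


before = {"slave-a": ["cpu1", "cpu2", "cpu3", "disk1"], "slave-b": ["cpu4"]}
after = rebalance(before, max_load=2)
```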

7. A control apparatus as recited in claim 1, wherein the 
master control mechanism is configured to: 

determine a status of the one or more slave control 
mechanisms, 

if one or more particular slave control mechanisms from 
the one or more slave control mechanisms are not 
responding or functioning correctly, then 
attempting to restart the one or more particular slave
control mechanisms, and
if the one or more particular slave control mechanisms
cannot be restarted, then

initiating one or more new slave control
mechanisms, and

reassigning control of processing resources and stor- 
age resources from the one or more particular 
slave control mechanisms to the one or more new 
slave control mechanisms. 
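
By way of illustration only, the status check, restart attempt and replacement recited in claim 7 might be sketched as one supervision pass. The function `supervise` and its three callbacks (`is_healthy`, `restart`, `spawn`) are hypothetical and stand in for whatever status and process-control facilities an implementation provides.

```python
def supervise(slaves, is_healthy, restart, spawn):
    """Run one supervision pass by a master over its slaves.

    slaves:     dict mapping slave name -> list of controlled resources
    is_healthy: callback(name) -> bool, True if the slave responds correctly
    restart:    callback(name) -> bool, True if the restart succeeded
    spawn:      callback() -> name of a newly initiated slave
    """
    for name in list(slaves):          # snapshot: the dict is mutated below
        if is_healthy(name):
            continue
        if restart(name):
            continue
        # Restart failed: initiate a new slave and hand over control of
        # all resources the failed slave was responsible for.
        replacement = spawn()
        slaves[replacement] = slaves.pop(name)
    return slaves


healthy = {"s1"}
restartable = {"s2"}
new_names = iter(["s4"])
result = supervise(
    {"s1": ["cpu1"], "s2": ["cpu2", "disk1"], "s3": ["disk2"]},
    is_healthy=lambda name: name in healthy,
    restart=lambda name: name in restartable,
    spawn=lambda: next(new_names),
)
```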

8. A control apparatus as recited in claim 1, wherein the 
one or more slave control mechanisms are configured to: 

determine a status of the master control mechanism, and 
if the master control mechanism has failed or is no longer 
functioning properly, elect a new master control 
mechanism from the one or more slave control mecha- 
nisms. 
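
By way of illustration only, the election recited in claim 8 might be sketched as follows. The claim does not state an election rule; choosing the live slave with the lowest identifier is an assumption of this sketch, as are the function name `elect_master` and its parameters.

```python
def elect_master(slave_ids, master_alive, is_alive=lambda slave_id: True):
    """Return the slave to promote, or None if the master is still working.

    slave_ids:    identifiers of the slave control mechanisms
    master_alive: result of the slaves' status check on the master
    is_alive:     hypothetical callback reporting whether a slave is usable
    """
    if master_alive:
        return None
    candidates = [s for s in slave_ids if is_alive(s)]
    if not candidates:
        raise RuntimeError("no live slave available for election")
    # Assumed rule: the lowest identifier wins, so every slave that runs
    # this function over the same inputs reaches the same answer.
    return min(candidates)


new_master = elect_master(["slave-3", "slave-1", "slave-2"], master_alive=False)
unchanged = elect_master(["slave-3", "slave-1"], master_alive=True)
```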

9. A control apparatus as recited in claim 1, wherein the 
one or more instructions from the master control mechanism 
are generated based upon expected processing and storage 
requirements for the first logical computing entity. 

10. A control apparatus as recited in claim 1, wherein the 
one or more slave control mechanisms are further configured 
to, in response to the one or more instructions from the 
master control mechanism, perform the following: 

dynamically change the number of processing resources 
in the first subset of processing resources, 

dynamically change the number of storage resources in 
the first subset of storage resources, 

dynamically change the communicative coupling between 
the first subset of processing resources and the first 
subset of storage resources to reflect changes in the 
number of processing resources in the first subset of 
processing resources and the number of storage 
resources in the first subset of storage resources. 

11. A control apparatus as recited in claim 1, wherein the
one or more slave control mechanisms are further configured
to, in response to the one or more instructions from the
master control mechanism, establish a second logical
computing entity that contains a second subset of processing
resources and a second subset of storage resources and is
capable of operating independent of the master control
mechanism, the one or more slave control mechanisms and
the first logical computing entity, wherein the second logical
computing entity is communicatively isolated from the first
logical computing entity by:

selecting the second subset of processing resources from
the set of processing resources,

selecting the second subset of storage resources from the
set of processing resources, and

causing the second subset of processing resources to be
communicatively coupled to the second subset of storage
resources.

12. A control apparatus as recited in claim 1, wherein:

the master control mechanism is communicatively
coupled to a central control mechanism,

the master control mechanism is configured to provide 
loading information for the first logical computing 
entity to the central control mechanism, and 
the master control mechanism is further configured to 
generate the one or more instructions for the one or 
more slave control mechanisms based upon one or 
more central control instructions received from the 
central control mechanism. 

13. A control apparatus as recited in claim 10, wherein 
changes to the number of processing resources in the first 
subset of processing resources and the number of storage 
resources in the first subset of storage resources is instructed 
by the master control mechanism based upon actual loading 
of the first subset of processing resources and first subset of 
storage resources. 

14. A control apparatus as recited in claim 11, wherein: 
the first subset of processing resources is communica- 
tively coupled to the first subset of storage resources 
using one or more storage area network (SAN) 
switches, 

the second subset of processing resources is communica- 
tively coupled to the second subset of storage resources 
using the one or more SAN switches, and 

the second logical resource group is communicatively 
isolated from the first logical resource group using 
tagging and SAN zoning. 

15. A control apparatus as recited in claim 14, wherein 
SAN zoning is performed using port-level SAN zoning or 
LUN level SAN zoning. 
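
By way of illustration only, the isolation by port-level zoning recited in claims 14 and 15 might be modeled as a table that maps each switch port to a zone, with traffic permitted only between ports of the same zone. The class `ZonedSwitch` is a conceptual model invented for this sketch and is not the interface of any actual SAN switch; it also simplifies by placing each port in at most one zone.

```python
class ZonedSwitch:
    """Hypothetical model of a SAN switch with port-level zoning."""

    def __init__(self):
        self.zone_of_port = {}

    def assign(self, port, zone):
        """Place a port in a zone, replacing any earlier assignment."""
        self.zone_of_port[port] = zone

    def can_communicate(self, port_a, port_b):
        """Two ports may exchange traffic only if they share a zone."""
        zone_a = self.zone_of_port.get(port_a)
        zone_b = self.zone_of_port.get(port_b)
        return zone_a is not None and zone_a == zone_b


switch = ZonedSwitch()
for port in ("cpu1", "disk1"):
    switch.assign(port, "entity-1")   # first logical computing entity
for port in ("cpu2", "disk2"):
    switch.assign(port, "entity-2")   # second logical computing entity
```

In this model, `switch.can_communicate("cpu1", "disk1")` is true, while a processor of one entity cannot reach the storage of the other.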

16. A method for managing processing resources com- 
prising the steps of: 

initiating a master control mechanism; and 
initiating one or more slave control mechanisms commu- 
nicatively coupled to the master control mechanism and 
being configured to, in response to one or more instruc- 
tions from the master control mechanism, establish a 
first logical computing entity that contains a first subset 
of processing resources and a first subset of storage 
resources and is capable of operating independent of 
the master control mechanism and the one or more 
slave control mechanisms, wherein the first logical 
computing entity is established by: 
selecting the first subset of processing resources from a 
set of processing resources, 
selecting the first subset of storage resources from a set
of storage resources, and 

causing the first subset of processing resources to be 
communicatively coupled to the first subset of stor- 
age resources. 

17. A method as recited in claim 16, wherein: 
initiating a master control mechanism includes initiating
a master control process executing on one or more
processors, and
initiating one or more slave control mechanisms includes 
initiating one or more slave processes executing on the 
one or more processors. 

18. A method as recited in claim 16, wherein: 
initiating a master control mechanism includes initiating
one or more master control processors, and
initiating one or more slave control mechanisms includes 
initiating one or more slave processors. 

19. A method as recited in claim 16, further comprising 
the master control mechanism dynamically reassigning 
control, based upon slave control process mechanism 
loading, between the one or more slave control mechanisms, 
of one or more processing resources from the subset of 
processing resources and one or more storage resources 
from the subset of storage resources. 

20. A method as recited in claim 16, further comprising 
the master control mechanism, based upon slave control 
process mechanism loading, 

dynamically allocating one or more additional slave con- 
trol mechanisms, and 

assigning control of one or more processing resources 
from the subset of processing resources and one or 
more storage resources from the subset of storage 
resources to the one or more additional slave control 
mechanisms. 

21. A method as recited in claim 16, further comprising 
the master control mechanism, based upon slave control 
process mechanism loading, 

reassigning control to one or more other slave control 
mechanisms from the one or more slave control mecha- 
nisms of one or more particular processing resources 
from the subset of processing resources and one or 
more particular storage resources from the subset of 
storage resources that were previously assigned to one 
or more particular slave control mechanisms from the 
one or more slave control mechanisms, and 

dynamically de-allocating the one or more particular slave
control mechanisms.

22. A method as recited in claim 16, further comprising 
the master control mechanism:

determining a status of the one or more slave control
mechanisms,

if one or more particular slave control mechanisms from
the one or more slave control mechanisms are not
responding or functioning correctly, then
attempting to restart the one or more particular slave
control mechanisms, and
if the one or more particular slave control mechanisms
cannot be restarted, then

initiating one or more new slave control
mechanisms, and

reassigning control of processing resources and stor- 
age resources from the one or more particular 
slave control mechanisms to the one or more new 
slave control mechanisms. 

23. A method as recited in claim 16, further comprising 
the one or more slave control mechanisms: 

determining a status of the master control mechanism, and 
if the master control mechanism has failed or is no longer
functioning properly, electing a new master control 
mechanism from the one or more slave control mecha- 
nisms. 

24. A method as recited in claim 16, wherein the one or 
more instructions from the master control mechanism are 
generated based upon expected processing and storage 
requirements for the first logical computing entity. 

25. A method as recited in claim 16, further comprising 
the one or more slave control mechanisms, in response to the 
one or more instructions from the master control 
mechanism, performing the following: 

dynamically changing the number of processing resources 
in the first subset of processing resources, 

dynamically changing the number of storage resources in 
the first subset of storage resources, 

dynamically changing the communicative coupling 
between the first subset of processing resources and the 
first subset of storage resources to reflect changes in the 
number of processing resources in the first subset of
processing resources and the number of storage 
resources in the first subset of storage resources. 

26. A method as recited in claim 16, further comprising 
the one or more slave control mechanisms, in response to the 
one or more instructions from the master control 
mechanism, establishing a second logical computing entity 
that contains a second subset of processing resources and a 
second subset of storage resources and is capable of oper- 
ating independent of the master control mechanism, the one 
or more slave control mechanisms and the first logical 
computing entity, wherein the second logical computing 
entity is communicatively isolated from the first logical 
computing entity, by: 

selecting the second subset of processing resources from
the set of processing resources,

selecting the second subset of storage resources from the
set of processing resources, and

causing the second subset of processing resources to be
communicatively coupled to the second subset of storage
resources.

27. A method as recited in claim 16, wherein: 

the master control mechanism is communicatively 
coupled to a central control mechanism, 

the master control mechanism is configured to provide
loading information for the first logical computing 
entity to the central control mechanism, and 

the master control mechanism is further configured to 
generate the one or more instructions for the one or 
more slave control mechanisms based upon one or
more central control instructions received from the 
central control mechanism. 

28. A method as recited in claim 25, wherein changes to 
the number of processing resources in the first subset of 
processing resources and the number of storage resources in
the first subset of storage resources is instructed by the 
master control mechanism based upon actual loading of the 
first subset of processing resources and first subset of storage 
resources. 

29. A method as recited in claim 26, wherein:

the first subset of processing resources is communica- 
tively coupled to the first subset of storage resources 
using one or more storage area network (SAN) 
switches, 

the second subset of processing resources is communicatively
coupled to the second subset of storage resources
using the one or more SAN switches, and 

the second logical resource group is communicatively
isolated from the first logical resource group using 
tagging and SAN zoning. 

30. A method as recited in claim 29, wherein SAN zoning 
is performed using port-level SAN zoning or LUN level 
SAN zoning. 

31. A computer-readable medium carrying one or more 
sequences of one or more instructions for managing pro- 
cessing resources, wherein execution of the one or more 
sequences of one or more instructions by one or more 
processors causes the one or more processors to perform the 
steps of: 

initiating a master control mechanism; and 

initiating one or more slave control mechanisms commu- 
nicatively coupled to the master control mechanism and 
being configured to, in response to one or more instruc- 
tions from the master control mechanism, establish a 
first logical computing entity that contains a first subset 
of processing resources and a first subset of storage 
resources and is capable of operating independent of 
the master control mechanism and the one or more 
slave control mechanisms, wherein the first logical 
computing entity is established by: 
selecting the first subset of processing resources from a 
set of processing resources, 

selecting the first subset of storage resources from a set of 
storage resources, and 

causing the first subset of processing resources to be 
communicatively coupled to the first subset of storage 
resources. 

32. A computer-readable medium as recited in claim 31, 
wherein: 

initiating a master control mechanism includes initiating
a master control process executing on one or more
processors, and

initiating one or more slave control mechanisms includes
initiating one or more slave processes executing on the
one or more processors.

33. A computer-readable medium as recited in claim 31, 
wherein: 

initiating a master control mechanism includes initiating 
one or more master control processors, and 

initiating one or more slave control mechanisms includes 
initiating one or more slave processors. 

34. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of 
instructions which, when executed by the one or more 
processors, cause the master control mechanism to dynami- 
cally reassign control, based upon slave control process 
mechanism loading, between the one or more slave control 
mechanisms, of one or more processing resources from the 
subset of processing resources and one or more storage 
resources from the subset of storage resources. 

35. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of 
instructions which, when executed by the one or more 
processors, cause the master control mechanism to, based 
upon slave control process mechanism loading, 

dynamically allocate one or more additional slave control 
mechanisms, and assign control of one or more pro- 
cessing resources from the subset of processing 
resources and one or more storage resources from the 
subset of storage resources to the one or more addi- 
tional slave control mechanisms. 

36. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of 
instructions which, when executed by the one or more
processors, cause the master control mechanism to, based 
upon slave control process mechanism loading, 

reassign control to one or more other slave control mechanisms
from the one or more slave control mechanisms
of one or more particular processing resources from the 
subset of processing resources and one or more par- 
ticular storage resources from the subset of storage 
resources that were previously assigned to one or more 
particular slave control mechanisms from the one or
more slave control mechanisms, and 
dynamically de-allocate the one or more particular slave 
control mechanisms. 

37. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of
instructions which, when executed by the one or more 
processors, cause the master control mechanism to: 

determine a status of the one or more slave control
mechanisms, and

if one or more particular slave control mechanisms from
the one or more slave control mechanisms are not
responding or functioning correctly, then
attempt to restart the one or more particular slave
control mechanisms, and
if the one or more particular slave control mechanisms 
cannot be restarted, then 

initiate one or more new slave control mechanisms, 
and 

reassign control of processing resources and storage
resources from the one or more particular slave 
control mechanisms to the one or more new slave 
control mechanisms. 

38. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of
instructions which, when executed by the one or more 
processors, cause the one or more slave control mechanisms 
to: 

determine a status of the master control mechanism, and 
if the master control mechanism has failed or is no longer
functioning properly, elect a new master control 
mechanism from the one or more slave control mecha- 
nisms. 

39. A computer-readable medium as recited in claim 31, 
wherein the one or more instructions from the master control
mechanism are generated based upon expected processing 
and storage requirements for the first logical computing 
entity. 

40. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of
instructions which, when executed by the one or more 
processors, cause the one or more slave control mechanisms,
in response to the one or more instructions from the master
control mechanism, performing the following:

dynamically changing the number of processing resources
in the first subset of processing resources,

dynamically changing the number of storage resources in
the first subset of storage resources,

dynamically changing the communicative coupling
between the first subset of processing resources and the
first subset of storage resources to reflect changes in the
number of processing resources in the first subset of
processing resources and the number of storage
resources in the first subset of storage resources.

41. A computer-readable medium as recited in claim 31, 
further comprising one or more additional sequences of 
instructions which, when executed by the one or more 
processors, cause the one or more slave control mechanisms 
to, in response to the one or more instructions from the 
master control mechanism, establish a second logical com- 
puting entity that contains a second subset of processing 
resources and a second subset of storage resources and is 
capable of operating independent of the master control 
mechanism, the one or more slave control mechanisms and 
the first logical computing entity, wherein the second logical 
computing entity is communicatively isolated from the first 
logical computing entity by: 

selecting the second subset of processing resources from
the set of processing resources,

selecting the second subset of storage resources from the
set of processing resources, and

causing the second subset of processing resources to be
communicatively coupled to the second subset of storage
resources.

42. A computer-readable medium as recited in claim 31, 
wherein: 

the master control mechanism is communicatively 
coupled to a central control mechanism, 

the master control mechanism is configured to provide 
loading information for the first logical computing 
entity to the central control mechanism, and 

the master control mechanism is further configured to 
generate the one or more instructions for the one or 
more slave control mechanisms based upon one or 
more central control instructions received from the 
central control mechanism. 

43. A computer-readable medium as recited in claim 40, 
wherein changes to the number of processing resources in 
the first subset of processing resources and the number of 
storage resources in the first subset of storage resources is 
instructed by the master control mechanism based upon 
actual loading of the first subset of processing resources and 
first subset of storage resources. 

44. A computer-readable medium as recited in claim 41, 
wherein: 

the first subset of processing resources is communica- 
tively coupled to the first subset of storage resources 
using one or more storage area network (SAN) 
switches, 

the second subset of processing resources is communica- 
tively coupled to the second subset of storage resources 
using the one or more SAN switches, and 

the second logical resource group is communicatively 
isolated from the first logical resource group using 
tagging and SAN zoning. 

45. A computer-readable medium as recited in claim 44, 
wherein SAN zoning is performed using port-level SAN 
zoning or LUN level SAN zoning. 